Estimating the outcome of spreading processes on networks with incomplete information: a mesoscale approach


Abstract in English

Recent advances in data collection have facilitated access to time-resolved human proximity data, which can conveniently be represented as temporal networks of contacts between individuals. While this type of data is fundamental to investigating how information or diseases propagate in a population, it is often incomplete, which can lead to biased conclusions. A major challenge is thus to estimate the outcome of spreading processes occurring on temporal networks built from partial information. To address this problem, we devise an approach based on Non-negative Tensor Factorization (NTF), a dimensionality reduction technique from multilinear algebra. The key idea is to learn a low-dimensional representation of the temporal network built from partial information, to adapt it to account for the temporal and structural heterogeneity properties known to be crucial for spreading processes on networks, and in this way to construct a surrogate network similar to the complete original network. To test our method, we consider several human proximity networks on which we simulate a loss of data. Applying our approach to each resulting partial network, we build a surrogate version of the corresponding complete network. We then compare the outcome of a spreading process on the complete networks (unaltered by data loss) and on the surrogate networks. We observe that the epidemic sizes obtained using the surrogate networks are in good agreement with those measured on the complete networks. Finally, we propose an extension of our framework for cases where additional data sources are available to cope with the missing data problem.
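To make the factorization step concrete, here is a minimal sketch of a rank-R non-negative tensor factorization of a three-way array (nodes x nodes x time steps), assuming a plain CP model fitted with multiplicative updates under a least-squares loss. This is a generic textbook variant and not necessarily the algorithm used by the authors; the function names (`khatri_rao`, `unfold`, `ntf`, `reconstruct`), the toy tensor, and all parameter values are illustrative assumptions. The step that adapts the low-dimensional representation to temporal and structural heterogeneity is not reproduced here.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product of B (J x R) and C (K x R): (J*K) x R."""
    R = B.shape[1]
    return (B[:, None, :] * C[None, :, :]).reshape(-1, R)

def unfold(T, mode):
    """Matricize T along `mode`, keeping the other axes in their original order."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def ntf(T, rank, n_iter=200, seed=0, eps=1e-12):
    """Non-negative CP factorization via multiplicative updates (a sketch)."""
    rng = np.random.default_rng(seed)
    # Strictly positive random initialization keeps the updates well defined.
    factors = [rng.random((dim, rank)) for dim in T.shape]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            numer = unfold(T, mode) @ kr
            denom = factors[mode] @ (kr.T @ kr) + eps
            # Multiplying by a non-negative ratio preserves non-negativity.
            factors[mode] *= numer / denom
    return factors

def reconstruct(factors):
    """Rebuild the tensor as a sum of `rank` outer products."""
    A, B, C = factors
    return np.einsum("ir,jr,tr->ijt", A, B, C)

# Toy data (assumption): a synthetic rank-2 non-negative tensor with
# 6 nodes and 10 time steps, standing in for a temporal contact network.
rng = np.random.default_rng(1)
true_factors = [rng.random((n, 2)) for n in (6, 6, 10)]
T = reconstruct(true_factors)

factors = ntf(T, rank=2)
rel_err = np.linalg.norm(T - reconstruct(factors)) / np.linalg.norm(T)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this sketch each of the R components pairs a set of nodes (the first two factor matrices) with a temporal activity profile (the third), which is the kind of mesoscale description the abstract refers to. A real contact tensor would be sparse and symmetric in its first two modes, which this toy example does not enforce.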
