In many data sets, crucial elements co-exist with non-essential ones and noise. For data represented as networks in particular, several methods have been proposed to extract a network backbone, i.e., the set of most important links. However, the question of how the resulting compressed views of the data can effectively be used has not been tackled. Here we address this issue by putting forward and exploring several systematic procedures to build surrogate data from various kinds of temporal network backbones. In particular, we explore how much information about the original data needs to be retained alongside the backbone so that the surrogate data can be used in data-driven numerical simulations of spreading processes. We illustrate our results using empirical temporal networks with a broad variety of structures and properties.
Much effort has been devoted to understanding how temporal network features and the choice of the source node affect the prevalence of a diffusion process. In this work, we address a further question: which local and temporal connection features make a node pair likely to appear in a diffusion trajectory or path, and thus to contribute to the actual information diffusion? We consider the Susceptible-Infected spreading process with a given infection probability per contact on a large number of real-world temporal networks. We illustrate how to construct the information diffusion backbone, in which the weight of each link gives the probability that a node pair appears in a diffusion process starting from a random node. We unravel how the backbones corresponding to different infection probabilities relate to each other and point out the importance of two extreme backbones, the backbone with infection probability one and the integrated network, between which the other backbones vary. We find that the temporal node pair feature that we propose predicts the links in the extreme backbone with infection probability one, as well as the high-weight links, better than the features derived from the integrated network. This finding, which holds across all the empirical networks, highlights that temporal information is crucial in determining a node pair's role in a diffusion process: a node pair with many early contacts tends to appear in a diffusion process. Our findings shed light on the in-depth understanding of information spread and may inspire its control.
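The backbone construction described in this abstract can be illustrated with a minimal sketch. It assumes contacts are given as (time, node, node) tuples, that contacts sharing a timestamp are processed in sorted order, and that a node pair "appears" in a diffusion process when the infection is transmitted along one of its contacts. The function names si_tree and diffusion_backbone, and the parameter runs_per_seed, are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def si_tree(contacts, seed, beta, rng):
    """Run one SI process from `seed` and return the node pairs that transmitted it.

    `contacts` is assumed to be sorted by time (illustrative assumption).
    """
    infected = {seed}
    tree = set()
    for _, u, v in contacts:
        # A transmission is possible only if exactly one endpoint is infected.
        if (u in infected) != (v in infected) and rng.random() < beta:
            infected.update((u, v))
            tree.add(frozenset((u, v)))
    return tree


def diffusion_backbone(contacts, beta, runs_per_seed=1, rng_seed=0):
    """Estimate, for each node pair, the fraction of runs in which it transmits.

    Every node is used as the seed `runs_per_seed` times, which mimics
    starting the diffusion from a uniformly random node.
    """
    rng = random.Random(rng_seed)
    contacts = sorted(contacts)
    nodes = sorted({n for _, u, v in contacts for n in (u, v)})
    counts = defaultdict(int)
    total_runs = len(nodes) * runs_per_seed
    for seed in nodes:
        for _ in range(runs_per_seed):
            for pair in si_tree(contacts, seed, beta, rng):
                counts[pair] += 1
    return {tuple(sorted(pair)): c / total_runs for pair, c in counts.items()}


# Toy temporal network with three contacts (made-up data).
toy_contacts = [(1, "a", "b"), (2, "b", "c"), (3, "a", "c")]
backbone = diffusion_backbone(toy_contacts, beta=1.0)
```

With infection probability one the process is deterministic, so a single run per seed suffices; for smaller probabilities, runs_per_seed would need to be large enough for the estimated weights to stabilise.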
We present an integrated approach to analysing multi-lead ECG data using the framework of multiplex recurrence networks (MRNs). We explore how their intralayer and interlayer topological features can capture the subtle variations in the recurrence patterns of the underlying spatio-temporal dynamics. We find that MRNs from ECG data of healthy cases are significantly more coherent, with high mutual information and less divergence between the respective degree distributions. In cases of disease, significant differences in specific measures of similarity between layers are seen. The coherence is affected most in diseases associated with localized abnormality, such as bundle branch block. We note that it is important to perform a comprehensive analysis using all the measures to arrive at disease-specific patterns. Our approach is very general and as such can be applied in any other domain where multivariate or multi-channel data are available from highly complex systems.
Networks are well-established representations of social systems, and temporal networks are widely used to study their dynamics. Temporal network data often consist of a succession of static networks over consecutive time windows whose length, however, is arbitrary and does not necessarily correspond to any intrinsic timescale of the system. Moreover, the resulting view of social network evolution is unsatisfactory: short time windows contain little information, whereas aggregating over large time windows blurs the dynamics. Going from a temporal network to a meaningful evolving representation of a social network therefore remains a challenge. Here we introduce a framework for that purpose: transforming temporal network data into an evolving weighted network where the weights of the links between individuals are updated at every interaction. Most importantly, this transformation takes into account the interdependence of social relationships due to the finite attention capacities of individuals: each interaction between two individuals not only reinforces their mutual relationship but also weakens their relationships with others. We study a concrete example of such a transformation and apply it to several data sets of social interactions. Using temporal contact data collected in schools, we show how our framework highlights specificities in their structure and temporal organization. We then introduce a synthetic perturbation into a data set of interactions in a group of baboons to show that it is possible to detect a perturbation in a social group on a wide range of timescales and parameters. Our framework brings new perspectives to the analysis of temporal social networks.
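The idea of updating link weights at every interaction, reinforcing the interacting pair and weakening each participant's other relationships, can be sketched as follows. The specific update rule used here (move the pair's weight towards 1 by a fraction alpha, and multiply the other weights of both participants by 1 - beta) is an illustrative assumption, not necessarily the rule studied in the paper. The function name evolve_weights and the parameters alpha and beta are hypothetical.

```python
from collections import defaultdict


def evolve_weights(interactions, alpha=0.1, beta=0.05):
    """Turn a list of (time, i, j) interactions into final link weights.

    Assumed rule (for illustration only):
      - the interacting pair is reinforced: w <- w + alpha * (1 - w)
      - every other link of i and of j is weakened: w <- (1 - beta) * w
    """
    weights = defaultdict(float)   # keyed by frozenset({i, j})
    neighbors = defaultdict(set)   # partners each node has interacted with so far
    for _, i, j in sorted(interactions):
        pair = frozenset((i, j))
        # Weaken the relationships of both participants with third parties.
        for node, partner in ((i, j), (j, i)):
            for k in neighbors[node]:
                if k != partner:
                    weights[frozenset((node, k))] *= 1 - beta
        # Reinforce the relationship between the two interacting individuals.
        weights[pair] += alpha * (1 - weights[pair])
        neighbors[i].add(j)
        neighbors[j].add(i)
    return dict(weights)


# Made-up example: a interacts with b, then with c.
final = evolve_weights([(1, "a", "b"), (2, "a", "c")], alpha=0.5, beta=0.5)
```

In this toy run the a-b weight rises to 0.5 after the first interaction and is halved to 0.25 when a later interacts with c, which illustrates the finite-attention effect in its simplest form.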
In this paper we analyse the bipartite Colombian firms-products network over a period of five years, from 2010 to 2014. Our analysis depicts a strongly modular system, with several groups of firms specializing in the export of specific categories of products. These clusters have been detected by running the bipartite variant of the traditional modularity maximization, revealing a bi-modular structure. Interestingly, this finding is refined by applying a recently proposed algorithm for projecting bipartite networks on the layer of interest and then running the Louvain algorithm on the resulting monopartite representations. Important structural differences emerge upon comparing the Colombian firms-products network with the World Trade Web. In particular, the bipartite representation of the latter is not characterized by a similar block structure, as the modularity maximization fails to reveal (bipartite) node clusters. This points out that economic systems behave differently at different scales: while countries tend to diversify their production, potentially exporting a large number of different products, firms specialize in exporting very limited baskets of basically homogeneous products.
Large-scale research endeavors can be hindered by logistical constraints limiting the amount of available data. For example, global ecological questions require a global dataset, and traditional sampling protocols are often too inefficient for a small research team to collect an adequate amount of data. Citizen science offers an alternative by crowdsourcing data collection. Despite its growing popularity, the community has been slow to embrace it, largely due to concerns about the quality of data collected by citizen scientists. Using the citizen science project Floating Forests (http://floatingforests.org), we show that consensus classifications made by citizen scientists produce data of comparable quality to expert-generated classifications. Floating Forests is a web-based project in which citizen scientists view satellite photographs of coastlines and trace the borders of kelp patches. Since its launch in 2014, over 7,000 citizen scientists have classified over 750,000 images of kelp forests, largely in California and Tasmania. Each image is classified by 15 users. We generated consensus classifications by overlaying all citizen classifications and assessed accuracy by comparing them to expert classifications. The Matthews correlation coefficient (MCC) was calculated for each threshold (1-15), and the threshold with the highest MCC was considered optimal. We showed that the optimal user threshold was 4.2 with an MCC of 0.400 (0.023 SE) for Landsats 5 and 7, and an MCC of 0.639 (0.246 SE) for Landsat 8. These results suggest that citizen science data derived from consensus classifications are of comparable accuracy to expert classifications. Citizen science projects should implement methods such as consensus classification in conjunction with a quantitative comparison to expert-generated classifications to avoid concerns about data quality.
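The threshold selection described in this abstract can be sketched in a few lines. The sketch assumes that each pixel carries a count of how many of the 15 users marked it as kelp, and that the expert classification is a boolean per pixel; it uses the standard MCC formula. The function names mcc and best_threshold and the toy data are hypothetical, and this is not the project's actual pipeline.

```python
import math


def mcc(pred, truth):
    """Matthews correlation coefficient for two equal-length boolean sequences."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def best_threshold(votes, expert, n_users=15):
    """Score every consensus threshold 1..n_users and return the best one.

    A pixel is labelled kelp at threshold k when at least k users marked it.
    Ties are resolved in favour of the lowest threshold.
    """
    scores = {
        k: mcc([v >= k for v in votes], expert)
        for k in range(1, n_users + 1)
    }
    best = max(scores, key=scores.get)
    return best, scores


# Made-up vote counts for six pixels and the matching expert labels.
toy_votes = [0, 1, 2, 5, 9, 12]
toy_expert = [False, False, False, True, True, True]
threshold, all_scores = best_threshold(toy_votes, toy_expert)
```

Applied per image, this yields one optimal integer threshold per image; a non-integer value such as the one reported could then arise from averaging over images, although the abstract does not say how it was obtained.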