Mobile phone calling is one of the most widely used communication methods in modern society. The records of calls among mobile phone users provide a valuable proxy for understanding human communication patterns embedded in social networks. Mobile phone users call each other, forming a directed calling network. If only reciprocal calls are considered, we obtain an undirected mutual calling network. The preferential communication behavior between two connected users can be statistically tested, which yields two Bonferroni networks with statistically validated edges. We perform a comparative analysis of the statistical properties of these four networks, which are constructed from the calling records of more than nine million individuals in Shanghai over a period of 110 days. We find that these networks share many common structural properties and also exhibit idiosyncratic features when compared with previously studied large mobile calling networks. The empirical findings provide an intriguing picture of a representative large social network that might shed new light on the modelling of large social networks.
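The step from the directed calling network to the undirected mutual calling network can be sketched as follows. This is a minimal illustration, assuming call records are available as (caller, callee) pairs; the function name `mutual_network` and the toy records are hypothetical and are not taken from the study. The statistical validation that produces the Bonferroni networks is not covered here.

```python
def mutual_network(calls):
    """Build the undirected mutual network from directed call records.

    calls: iterable of (caller, callee) pairs (hypothetical input format).
    Returns a set of frozensets, one per pair of users who called each other.
    """
    # Collapse repeated calls into a set of directed edges, ignoring self-calls.
    directed = {(a, b) for a, b in calls if a != b}
    # Keep an undirected edge only when the reverse directed edge also exists.
    return {frozenset((a, b)) for a, b in directed if (b, a) in directed}


# Toy records: A and B call each other, C and D call each other,
# while A calls C without being called back.
calls = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "D"), ("D", "C"), ("A", "B")]
mutual = mutual_network(calls)
```

Storing each undirected edge as a `frozenset` means the pair (A, B) and the pair (B, A) map to the same edge, so no explicit deduplication is needed.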
Measuring graph clustering quality remains an open problem. To address it, we introduce quality measures based on comparisons of intra- and inter-cluster densities, an accompanying statistical test of the significance of their differences, and a step-by-step routine for clustering quality assessment. Our null hypothesis does not rely on any generative model for the graph, unlike modularity, which uses the configuration model as a null model. Our measures are shown to meet the axioms of a good clustering quality function, unlike the very commonly used modularity measure. They also have an intuitive graph-theoretic interpretation and a formal statistical interpretation, and they can be easily tested for significance. Our work is centered on the idea that well-clustered graphs will display a significantly larger intra-cluster density than inter-cluster density. We develop tests to validate the existence of such a cluster structure. We empirically explore the behavior of our measures under a number of stress test scenarios and compare their behavior to the commonly used modularity and conductance measures. Empirical stress test results confirm that our measures compare very favorably to the established ones. In particular, they are shown to be more responsive to graph structure, less sensitive to sample size and to breakdowns during numerical implementation, and less sensitive to uncertainty in connectivity. These features are especially important in the context of larger data sets or when the data may contain errors in the connectivity patterns.
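The comparison of intra- and inter-cluster densities can be illustrated with a short sketch. It assumes the simplest definition of density (observed edges divided by possible node pairs) for an undirected simple graph; the function name `cluster_densities` is hypothetical, and the code does not reproduce the statistical test or the exact measures of the work summarized above.

```python
from collections import Counter


def cluster_densities(edges, labels):
    """Return (intra_density, inter_density) for an undirected simple graph.

    edges:  iterable of (u, v) pairs.
    labels: dict mapping every node to its cluster label.
    Density here is observed edges over possible node pairs (an assumption).
    """
    n = len(labels)
    sizes = Counter(labels.values())
    # Number of node pairs that fall inside a cluster, and across clusters.
    intra_pairs = sum(s * (s - 1) // 2 for s in sizes.values())
    inter_pairs = n * (n - 1) // 2 - intra_pairs
    # Deduplicate edges and drop self-loops.
    unique = {frozenset((u, v)) for u, v in edges if u != v}
    intra_edges = sum(1 for e in unique if len({labels[v] for v in e}) == 1)
    inter_edges = len(unique) - intra_edges
    intra = intra_edges / intra_pairs if intra_pairs else 0.0
    inter = inter_edges / inter_pairs if inter_pairs else 0.0
    return intra, inter


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
intra, inter = cluster_densities(edges, labels)
```

For this toy graph every within-cluster pair is connected while only one of the nine cross-cluster pairs is, which is the kind of gap a well-clustered graph is expected to show.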
We study the dynamic network of relationships among avatars in the massively multiplayer online game Planetside 2. In the spring of 2014, two separate servers of this game were merged, and as a result, two previously distinct networks were combined into one. We observed the evolution of this network in the seven-month period following the merger and report our observations. We found that some structures of the original networks persist in the combined network for a long time after the merger. As the original avatars are gradually removed, these structures slowly dissolve, but they remain observable for a surprisingly long time. We present a number of visualizations illustrating the post-merger dynamics and discuss the time evolution of selected quantities characterizing the topology of the network.
A major problem in the study of complex socioeconomic systems is represented by privacy issues, which can severely limit the amount of accessible information and force researchers to build models on the basis of incomplete knowledge. In this paper we investigate a novel method to reconstruct global topological properties of a complex network starting from limited information. This method uses the knowledge of an intrinsic property of the nodes (indicated as fitness), and the number of connections of only a limited subset of nodes, in order to generate an ensemble of exponential random graphs that are representative of the real system and that can be used to estimate its topological properties. Here we focus in particular on reconstructing the most basic properties that are commonly used to describe a network: density of links, assortativity, and clustering. We test the method on both benchmark synthetic networks and real economic and financial systems, finding a remarkable robustness with respect to the number of nodes used for calibration. The method thus represents a valuable tool for gaining insights into privacy-protected systems.
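The calibration idea can be sketched in a few lines. The abstract does not state the connection probability, so the code below assumes the form commonly used for fitness-driven exponential random graphs, p_ij = z x_i x_j / (1 + z x_i x_j), with a single free parameter z fixed by matching the total expected degree of the known nodes to their total observed degree. All function names are hypothetical, and the bisection solver is one simple choice among many.

```python
def link_probability(x_i, x_j, z):
    """Assumed fitness-model connection probability z*x_i*x_j / (1 + z*x_i*x_j)."""
    t = z * x_i * x_j
    return t / (1.0 + t)


def expected_degree(i, fitness, z):
    """Expected degree of node i under the assumed model."""
    return sum(link_probability(fitness[i], x_j, z)
               for j, x_j in enumerate(fitness) if j != i)


def calibrate_z(fitness, known_degrees, iterations=200):
    """Find z by bisection so that expected and observed degrees match in total.

    fitness:       list of strictly positive fitness values, one per node.
    known_degrees: dict {node index: observed degree} for the known subset.
    """
    if any(x <= 0 for x in fitness):
        raise ValueError("fitness values must be strictly positive")
    target = sum(known_degrees.values())
    if not 0 <= target < len(known_degrees) * (len(fitness) - 1):
        raise ValueError("observed degrees are not attainable")

    def total(z):
        return sum(expected_degree(i, fitness, z) for i in known_degrees)

    low, high = 0.0, 1.0
    while total(high) < target:      # expand until the root is bracketed
        high *= 2.0
    for _ in range(iterations):      # total(z) is increasing in z
        mid = 0.5 * (low + high)
        if total(mid) < target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def expected_density(fitness, z):
    """Expected link density of the ensemble: mean of p_ij over all node pairs."""
    n = len(fitness)
    s = sum(link_probability(fitness[i], fitness[j], z)
            for i in range(n) for j in range(i + 1, n))
    return s / (n * (n - 1) / 2)


# Synthetic check: degrees of two nodes generated with z = 0.1 should recover z.
fitness = [1.0, 2.0, 3.0, 4.0, 5.0]
known = {0: expected_degree(0, fitness, 0.1), 3: expected_degree(3, fitness, 0.1)}
z_hat = calibrate_z(fitness, known)
density = expected_density(fitness, z_hat)
```

Once z is calibrated, other properties such as assortativity or clustering could be estimated by sampling graphs from the link probabilities and averaging over the ensemble.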
The degree distribution of nodes, especially a power law degree distribution, has been regarded as one of the most significant structural characteristics of social and information networks. Node degree, however, only discloses the first-order structure of a network. Higher-order structures such as the edge embeddedness and the size of communities may play more important roles in many online social networks. In this paper, we provide empirical evidence on the existence of rich higher-order structural characteristics in online social networks, develop mathematical models to interpret and model these characteristics, and discuss their various applications in practice. In particular, 1) We show that the embeddedness distribution of social links in many social networks has interesting and rich behavior that cannot be captured by well-known network models. We also provide empirical results showing a clear correlation between the embeddedness distribution and the average number of messages communicated between pairs of social network nodes. 2) We formally prove that the random k-tree, a recent model for complex networks, has a power law embeddedness distribution, and show empirically that the random k-tree model can be used to capture the rich behavior of higher-order structures we observed in real-world social networks. 3) Going beyond the embeddedness, we show that a variant of the random k-tree model can be used to capture the power law distribution of the size of communities of overlapping cliques discovered recently.
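As an illustration of edge embeddedness, the sketch below takes the standard definition, the number of common neighbors shared by the two endpoints of an edge, and tabulates its distribution for an undirected graph. The definition is assumed rather than quoted from the paper, and the function name `edge_embeddedness` is hypothetical.

```python
from collections import Counter, defaultdict


def edge_embeddedness(edges):
    """Map each undirected edge to the number of common neighbors of its endpoints."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:                   # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)
    # Each edge is visited from both endpoints; the frozenset key merges them.
    return {frozenset((u, v)): len(adjacency[u] & adjacency[v])
            for u in adjacency for v in adjacency[u]}


# A triangle (0, 1, 2) with a pendant edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
embeddedness = edge_embeddedness(edges)
distribution = Counter(embeddedness.values())
```

Each triangle edge has one common neighbor and the pendant edge has none, so the distribution counts three edges with embeddedness 1 and one edge with embeddedness 0.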
The newly released Orange D4D mobile phone database provides new insights into the use of mobile technology in a developing country. Here we perform a series of spatial data analyses that reveal important geographic aspects of mobile phone use in Côte d'Ivoire. We first map the locations of base stations with respect to the population distribution and the number and duration of calls at each base station. On this basis, we estimate the energy consumed by the mobile phone network. Finally, we perform an analysis of inter-city mobility and identify high-traffic roads in the country.