A number of predictors have been suggested to detect the most influential spreaders of information in online social media across various domains such as Twitter or Facebook. In particular, degree, PageRank, k-core and other centralities have been adopted to rank the spreading capability of users in information dissemination media. So far, the proposed predictors have been validated by simulating the spreading dynamics rather than by following real information flow in social networks. Consequently, results on the best predictor have so far been model-dependent and contradictory. Here, we address this issue directly. We search for influential spreaders by following the real spreading dynamics in a wide range of networks. We find that the widely used degree and PageRank fail in ranking users' influence. We find that the best spreaders are consistently located in the k-core across dissimilar social platforms such as Twitter, Facebook, Livejournal and scientific publishing in the American Physical Society. Furthermore, when the complete global network structure is unavailable, we find that the sum of the nearest neighbors' degrees is a reliable local proxy for a user's influence. Our analysis provides practical instructions for optimal design of strategies for viral information dissemination in relevant applications.
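The two quantities named in the abstract above, the k-core (k-shell) index and the sum of the nearest neighbors' degrees, can be illustrated on a toy graph. The following is a minimal sketch, not the authors' code: the function names (`build_adjacency`, `k_shell_index`, `neighbor_degree_sum`) and the example edge list are hypothetical, and the k-shell index is computed with the standard iterative pruning procedure.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_shell_index(adj):
    """Assign each node its k-shell index by repeatedly pruning low-degree nodes."""
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Nodes whose remaining degree is at most k belong to shell k.
        peel = [u for u in remaining if degree[u] <= k]
        if not peel:
            k += 1
            continue
        while peel:
            u = peel.pop()
            if u not in remaining:
                continue  # already pruned (a node can be queued more than once)
            remaining.discard(u)
            shell[u] = k
            for v in adj[u]:
                if v in remaining:
                    degree[v] -= 1
                    if degree[v] <= k:
                        peel.append(v)
    return shell


def neighbor_degree_sum(adj):
    """Local proxy: for each node, the sum of the degrees of its nearest neighbors."""
    return {u: sum(len(adj[v]) for v in nbrs) for u, nbrs in adj.items()}


# Hypothetical toy graph: a triangle (a, b, c) plus a hub h with four leaves.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("h", "a"),
         ("h", "l1"), ("h", "l2"), ("h", "l3"), ("h", "l4")]
adj = build_adjacency(edges)
shell = k_shell_index(adj)
nds = neighbor_degree_sum(adj)

# The hub has the largest degree (5) but sits in shell 1;
# the triangle nodes have smaller degree but sit in shell 2.
print(len(adj["h"]), shell["h"], nds["h"])  # 5 1 7
print(len(adj["a"]), shell["a"], nds["a"])  # 3 2 9
```

In this toy graph the degree ranking puts the hub first, while both the k-shell index and the neighbor-degree sum rank the triangle node `a` higher. The sketch shows only how the rankings can disagree; it says nothing about which one predicts real spreading.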
Although the many forms of modern social media have become major channels for the dissemination of information, they are becoming overloaded because of the rapidly expanding number of information feeds. We analyze the expanding user-generated content in Sina Weibo, the largest micro-blog site in China, and find evidence that popular messages often follow a mechanism that differs from that found in the spread of disease, contrary to common belief. In this mechanism, an individual with more friends needs more repeated exposures to spread the information further. Moreover, our data suggest that, in contrast to epidemics, for certain messages the chance that an individual shares the message is proportional to the fraction of his/her neighbors who shared it with him/her. Thus, the greater the number of friends an individual has, the greater the number of repeated contacts needed to spread the message, which is a result of competition for attention. We model this process using a fractional susceptible infected recovered (FSIR) model, where the infection probability of a node is proportional to its fraction of infected neighbors. Our findings have dramatic implications for information contagion. For example, using the FSIR model we find that real-world social networks have a finite epidemic threshold. This is in contrast to the zero threshold that conventional wisdom derives from disease epidemic models. This means that when individuals are overloaded with excess information feeds, a piece of information either reaches the population, if it is above the critical epidemic threshold, or is never well received, so that only a handful of information contents can spread widely throughout the population.
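The FSIR rule stated above, infection probability proportional to the fraction of infected neighbors, can be sketched as a small discrete-time simulation. This is an illustration under stated assumptions rather than the authors' implementation: the proportionality constant `beta`, the synchronous update, and recovery after exactly one time step are all assumptions, and the function names are hypothetical.

```python
import random


def share_probability(adj, state, node, beta):
    """Infection probability of `node`: beta times its fraction of infected neighbors."""
    nbrs = adj[node]
    if not nbrs:
        return 0.0
    infected = sum(1 for v in nbrs if state[v] == "I")
    return beta * infected / len(nbrs)


def fsir_step(adj, state, beta, rng):
    """One synchronous update of all nodes (states: 'S', 'I', 'R')."""
    new_state = dict(state)
    for u in adj:
        if state[u] == "S":
            if rng.random() < share_probability(adj, state, u, beta):
                new_state[u] = "I"
        elif state[u] == "I":
            new_state[u] = "R"  # assumption: recovery after one time step
    return new_state


def run_fsir(adj, seed, beta, rng=None):
    """Run from a single infected seed until no node is infected; return outbreak size."""
    rng = rng or random.Random(0)
    state = {u: "S" for u in adj}
    state[seed] = "I"
    while any(s == "I" for s in state.values()):
        state = fsir_step(adj, state, beta, rng)
    return sum(1 for s in state.values() if s == "R")


# Hypothetical star graph: one hub with four leaves, one leaf infected.
star = {"hub": {"l1", "l2", "l3", "l4"},
        "l1": {"hub"}, "l2": {"hub"}, "l3": {"hub"}, "l4": {"hub"}}
star_state = {"hub": "S", "l1": "I", "l2": "S", "l3": "S", "l4": "S"}

# A hub with four friends and one exposure shares with probability beta / 4.
print(share_probability(star, star_state, "hub", beta=1.0))  # 0.25
```

With a single exposure, the hub's probability falls as its number of friends grows. This is the competition-for-attention effect the abstract describes, and it is where the fractional rule departs from a standard SIR model, in which each infected contact transmits independently.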
Many real-world networks are known to exhibit features that run counter to what theories of network creation and communication patterns prescribe. A common prerequisite in network analysis is that information on nodes and links is complete, because network topologies are extremely sensitive to missing information of this kind. Therefore, many real-world networks that fail to meet this criterion under random sampling may be discarded. In this paper we offer a framework for interpreting the missing observations in network data under the hypothesis that these observations are not missing at random. We demonstrate the methodology with a case study of a financial trade network, where agents' awareness of the data collection procedure carried out by a self-interested observer may result in strategic revealing or withholding of information. This non-random missingness has been overlooked, even though it may be an important feature of the processes by which the network is generated. The analysis demonstrates that strategic information withholding may be a valid general phenomenon in complex systems. The evidence is sufficient to support the existence of an influential observer and to offer a compelling dynamic mechanism for the creation of the network.
We study Axelrod's cultural adaptation model using the concept of cluster size entropy, $S_{c}$, which gives information on the variability of the cultural cluster sizes present in the system. Using networks of different topologies, from regular to random, we find that the critical point of the well-known nonequilibrium monocultural-multicultural (order-disorder) transition of the Axelrod model is unambiguously given by the maximum of the $S_{c}(q)$ distributions. The width of the cluster entropy distributions can be used to qualitatively determine whether the transition is first- or second-order. By scaling the cluster entropy distributions we were able to obtain a relationship between the critical cultural trait $q_c$ and the number $F$ of cultural features in regular networks. We also analyze the effect of the mass media (external field) on social systems within the Axelrod model in a square network. We find a new partially ordered phase whose largest cultural cluster is not aligned with the external field, in contrast with a recent suggestion that this type of phase cannot be formed in regular networks. We draw a new $q-B$ phase diagram for the Axelrod model in regular networks.
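The abstract above does not state the formula for the cluster size entropy. A minimal sketch, assuming the Shannon form $S_{c} = -\sum_{s} W_{s} \ln W_{s}$ with $W_{s}$ taken as the fraction of clusters that have size $s$, might look as follows. Both the formula and the function name `cluster_size_entropy` are assumptions made for illustration.

```python
import math
from collections import Counter


def cluster_size_entropy(cluster_sizes):
    """Shannon entropy of the cluster-size distribution (assumed definition of S_c).

    `cluster_sizes` lists the size of every cultural cluster in one configuration.
    """
    counts = Counter(cluster_sizes)   # number of clusters of each size
    total = sum(counts.values())      # total number of clusters
    entropy = 0.0
    for n in counts.values():
        w = n / total                 # W_s: fraction of clusters with size s
        entropy -= w * math.log(w)
    return entropy


# One cluster spanning the whole system (monocultural state): no variability.
print(cluster_size_entropy([100]))          # 0.0
# Only single-site clusters (fully disordered state): again no variability.
print(cluster_size_entropy([1] * 100))      # 0.0
# A mix of cluster sizes gives a positive entropy.
print(cluster_size_entropy([1, 1, 2, 5, 5, 20]) > 0)  # True
```

Under this assumed definition the entropy vanishes in both the fully ordered and the fully disordered limits and is positive in between. That is consistent with the abstract's statement that the maximum of $S_{c}(q)$ marks the transition.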
The recent widespread adoption of electronic and pervasive technologies has enabled the study of human behavior at an unprecedented level, uncovering universal patterns underlying human activity, mobility, and inter-personal communication. In the present work, we investigate whether deviations from these universal patterns may reveal information about the socio-economic status of geographical regions. We quantify the extent to which deviations in diurnal rhythm, mobility patterns, and communication styles across regions relate to their unemployment incidence. For this we examine a country-scale publicly articulated social media dataset, where we quantify individual behavioral features from over 145 million geo-located messages distributed among more than 340 different Spanish economic regions, inferred by computing communities of cohesive mobility fluxes. We find that regions exhibiting more diverse mobility fluxes, earlier diurnal rhythms, and more correct grammatical styles display lower unemployment rates. As a result, we provide a simple model able to produce accurate, easily interpretable reconstructions of regional unemployment incidence from their social-media digital fingerprints alone. Our results show that cost-effective economic indicators can be built based on publicly available social media datasets.
Daily interactions naturally define social circles. Individuals tend to be friends with the people they spend time with and they choose to spend time with their friends, inextricably entangling physical location and social relationships. As a result, it is possible to predict not only someone's location from their friends' locations but also friendship from spatial and temporal co-occurrence. While several models have been developed to separately describe mobility and the evolution of social networks, there is a lack of studies coupling social interactions and mobility. In this work, we introduce a new model that bridges this gap by explicitly considering the feedback of mobility on the formation of social ties. Data from three online social networks (Twitter, Gowalla and Brightkite) are used for validation. Our model reproduces various topological and physical properties of these networks such as: i) the size of the connected components, ii) the distance distribution between connected users, iii) the dependence of the reciprocity on the distance, iv) the variation of the social overlap and the clustering with the distance. Besides numerical simulations, a mean-field approach is also used to study analytically the main statistical features of the networks generated by the model. The robustness of the results to changes in the model parameters is explored, finding that a balance between friend visits and long-range random connections is essential to reproduce the geographical features of the empirical networks.