The full range of activity in a temporal network is captured in its edge activity data -- time series encoding the tie strengths or on-off dynamics of each edge in the network. However, in many practical applications, edge-level data are unavailable, and network analyses must rely instead on node activity data, which aggregate the edge activity data and are thus less informative. This raises the question: Is it possible to use the static network to recover the richer edge activities from the node activities? Here we show that recovery is possible, often with a surprising degree of accuracy given how much information is lost, and that the recovered data are useful for subsequent network analysis tasks. Recovery is more difficult when network density increases, either topologically or dynamically, but exploiting dynamical and topological sparsity enables effective solutions to the recovery problem. We formally characterize the difficulty of the recovery problem both theoretically and empirically, proving the conditions under which recovery errors can be bounded and showing that, even when these conditions are not met, good-quality solutions can still be derived. Effective recovery carries both promise and peril, as it enables deeper scientific study of complex systems but, in the context of social systems, also raises privacy concerns when social information can be aggregated across multiple data sources.
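To make the recovery problem concrete, here is a minimal sketch for the easiest case, not the method of the abstract above. It assumes (our assumption, not stated in the abstract) that a node's activity at one time step is the sum of the activities on its incident edges, and that the static network is a tree. Under those assumptions a leaf's activity equals the activity of its single edge, so edges can be peeled off one at a time. The function name `recover_edges_on_tree` and the toy data are hypothetical.

```python
def recover_edges_on_tree(edges, node_activity):
    """Recover per-edge activity from per-node activity on a tree.

    Assumes node activity = sum of incident edge activities (one time step).
    edges: list of (u, v) pairs forming a tree.
    node_activity: dict mapping node -> observed activity.
    Returns a dict mapping a sorted (u, v) tuple -> recovered edge activity.
    """
    remaining = dict(node_activity)  # activity not yet explained by resolved edges
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    leaves = [n for n, nb in neighbors.items() if len(nb) == 1]
    recovered = {}
    while leaves:
        leaf = leaves.pop()
        if len(neighbors[leaf]) != 1:
            continue  # its last edge was already resolved from the other end
        (parent,) = neighbors[leaf]
        value = remaining[leaf]  # a leaf's remaining activity is its only edge
        recovered[tuple(sorted((leaf, parent)))] = value
        remaining[leaf] -= value
        remaining[parent] -= value
        neighbors[leaf].remove(parent)
        neighbors[parent].remove(leaf)
        if len(neighbors[parent]) == 1:
            leaves.append(parent)  # the parent has become a leaf
    return recovered


# Toy path a-b-c-d with true edge activities ab=2, bc=3, cd=1.
toy_edges = [("a", "b"), ("b", "c"), ("c", "d")]
toy_nodes = {"a": 2, "b": 5, "c": 4, "d": 1}
toy_recovered = recover_edges_on_tree(toy_edges, toy_nodes)
```

Under the same summation assumption, a network with more edges than nodes gives more unknowns than equations at each time step, which is one way to see why denser networks are harder and why sparsity has to be exploited.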
In the study of information propagation, one fundamental problem is uncovering universal laws governing its dynamics. From the microscopic perspective, this problem is formulated as estimating the probability that a piece of information propagates from one individual to another. Such a propagation probability generally depends on two major classes of factors: the intrinsic attractiveness of information and the interactions between individuals. Although the temporal effect of attractiveness is widely studied, the temporal laws underlying individual interactions remain unclear, causing inaccurate prediction of information propagation on evolving social networks. In this report, we empirically study the dynamics of information propagation, using a dataset from a population-scale social media website. We discover a temporal scaling in information propagation: the probability that a message propagates between two individuals decays with the time elapsed since their latest interaction, obeying a power law. Leveraging the scaling law, we further propose a temporal model to estimate future propagation probabilities between individuals, reducing the error rate of information propagation prediction from 6.7% to 2.6% and improving viral marketing with 9.7% incremental customers.
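As an illustration of what such a scaling law looks like in practice, the sketch below fits a power law p(t) = c * t^(-alpha) by ordinary least squares on log-log axes. This is a generic fitting recipe, not necessarily the estimator used in the report; the function name `fit_power_law`, the synthetic data, and the values 1.2 and 0.05 are made up for the example.

```python
import math


def fit_power_law(latencies, probabilities):
    """Fit p(t) = c * t**(-alpha) by least squares on log-log axes.

    Returns (alpha, c). All inputs must be strictly positive.
    """
    xs = [math.log(t) for t in latencies]
    ys = [math.log(p) for p in probabilities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope of log p against log t
    intercept = mean_y - slope * mean_x    # log of the prefactor c
    return -slope, math.exp(intercept)


# Synthetic, noise-free data (illustrative values only).
example_latencies = [1, 2, 4, 8, 16, 32]
example_probs = [0.05 * t ** -1.2 for t in example_latencies]
alpha_hat, c_hat = fit_power_law(example_latencies, example_probs)
```

On real data, a log-log regression is sensitive to binning and to noisy tails, so the fitted exponent should be treated as a rough estimate.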
Social networks play a fundamental role in the diffusion of information. However, there are two different ways in which information can reach a person in a network: through connections in our social networks, and through the influence of external out-of-network sources, like the mainstream media. While most present models of information adoption in networks assume information only passes from node to node via the edges of the underlying network, the recent availability of massive online social media data allows us to study this process in more detail. We present a model in which information can reach a node via the links of the social network or through the influence of external sources. We then develop an efficient model parameter fitting technique and apply the model to the emergence of URL mentions in the Twitter network. Using a complete one-month trace of Twitter, we study how information reaches the nodes of the network. We quantify the external influences over time and describe how these influences affect information adoption. We discover that information tends to jump across the network, which can only be explained as an effect of an unobservable external influence on the network. We find that only about 71% of the information volume in Twitter can be attributed to network diffusion, and the remaining 29% is due to external events and factors outside the network.
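The idea of two exposure channels can be sketched with a toy calculation. This is not the model of the abstract above, whose form the abstract does not give. It assumes each infected neighbor independently exposes a node with probability `p_link` in a time step, and an external source independently exposes it with probability `p_external`; both parameters and the function name are hypothetical.

```python
def exposure_probability(infected_neighbors, p_link, p_external):
    """Probability a node is exposed at least once in one time step.

    Toy assumption: every infected neighbor and the external source
    act independently of one another.
    """
    p_no_network = (1.0 - p_link) ** infected_neighbors  # no neighbor exposes
    p_no_external = 1.0 - p_external                     # external source silent
    return 1.0 - p_no_network * p_no_external


# A node with no infected neighbors can still be exposed externally,
# which is how information can appear to "jump" across the network.
isolated = exposure_probability(0, 0.5, 0.2)
connected = exposure_probability(2, 0.5, 0.0)
```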
Online social networks are often subject to influence campaigns by malicious actors through the use of automated accounts known as bots. We consider the problem of detecting bots in online social networks and assessing their impact on the opinions of individuals. We begin by analyzing the behavior of bots in social networks and identify that they exhibit heterophily, meaning they interact with humans more than with other bots. We use this property to develop a detection algorithm based on the Ising model from statistical physics. The bots are identified by solving a minimum cut problem. We show that this Ising model algorithm can identify bots with higher accuracy while utilizing much less data than other state-of-the-art methods. We then develop a function we call generalized harmonic influence centrality to estimate the impact bots have on the opinions of users in social networks. This function is based on a generalized opinion dynamics model and captures how the activity level and network connectivity of the bots shift equilibrium opinions. To apply generalized harmonic influence centrality to real social networks, we develop a deep neural network to measure the opinions of users based on their social network posts. Using this neural network, we then calculate the generalized harmonic influence centrality of bots in multiple real social networks. For some networks, we find that a limited number of bots can cause non-trivial shifts in the population opinions. In other networks, we find that the bots have little impact. Overall, we find that generalized harmonic influence centrality is a useful operational tool to measure the impact of bots in social networks.
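The step "bots are identified by solving a minimum cut problem" can be illustrated with a generic binary-labeling sketch. It builds a flow network with two terminal nodes, finds a minimum s-t cut with the Edmonds-Karp algorithm, and labels accounts by the side of the cut they fall on. The capacities below are invented for the example and do not reproduce the Ising energies of the paper, which the abstract does not specify; in particular they do not encode heterophily. The names `min_cut_source_side`, `HUMAN`, and `BOT` are hypothetical.

```python
from collections import deque


def min_cut_source_side(capacity, source, sink):
    """Return the nodes on the source side of a minimum s-t cut (Edmonds-Karp).

    capacity: dict of dicts, capacity[u][v] = capacity of arc u -> v.
    """
    # Residual graph, with reverse arcs initialised to zero where missing.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No augmenting path left: reachable nodes form the source side.
            return set(parent)

        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u


# Made-up costs: HUMAN -> x is the cost of calling x a bot, x -> BOT the cost
# of calling x human, and x -> y the cost of giving x and y different labels.
capacity = {
    "HUMAN": {"a": 5, "b": 4, "c": 1},
    "a": {"BOT": 1, "b": 2},
    "b": {"BOT": 1, "a": 2, "c": 1},
    "c": {"BOT": 5, "b": 1},
}
human_side = min_cut_source_side(capacity, "HUMAN", "BOT")
bots = {"a", "b", "c"} - human_side
```

With these costs the cheapest cut keeps accounts `a` and `b` on the human side and assigns `c` to the bot side.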
In both the classical and quantum worlds, information cannot appear or disappear. This fundamental principle, however, is questioned for a black hole by the acclaimed information loss paradox. Based on the conservation laws of energy, charge, and angular momentum, we recently showed that the total information encoded in the correlations among Hawking radiations exactly equals the amount previously considered lost, assuming the non-thermal spectrum of Parikh and Wilczek. Thus the information loss paradox can be falsified through experiments by detecting correlations, for instance, through measuring the covariances of Hawking radiations from black holes, such as the manmade ones speculated to appear in LHC experiments. The affirmation of information conservation in Hawking radiation will shine new light on the unification of gravity with quantum mechanics.
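The bookkeeping behind this claim can be checked numerically in the simplest setting. The sketch below assumes a Schwarzschild black hole in units G = c = hbar = k_B = 1 and the Parikh-Wilczek tunneling probability in the form Gamma(E; M) proportional to exp[-8*pi*E*(M - E/2)]. That formula is recalled from the tunneling literature and is not stated in the abstract; energy is the only conserved quantity tracked here. The function names are hypothetical.

```python
import math


def log_emission_probability(energy, mass):
    """ln Gamma for emitting `energy` from a Schwarzschild hole of `mass`
    (assumed non-thermal Parikh-Wilczek form, up to normalisation)."""
    return -8.0 * math.pi * energy * (mass - energy / 2.0)


def correlation(e1, e2, mass):
    """ln Gamma(e1 + e2) - ln[Gamma(e1) * Gamma(e2)] for two emissions."""
    return (log_emission_probability(e1 + e2, mass)
            - log_emission_probability(e1, mass)
            - log_emission_probability(e2, mass))


def sequence_entropy(energies, mass):
    """Sum of -ln Gamma over sequential emissions, each conditioned on the
    mass remaining after the earlier ones."""
    total = 0.0
    remaining = mass
    for e in energies:
        total += -log_emission_probability(e, remaining)
        remaining -= e
    return total


initial_mass = 1.0
emitted = [0.1, 0.2, 0.3, 0.4]  # energies summing to the full mass
radiated_entropy = sequence_entropy(emitted, initial_mass)
bekenstein_hawking = 4.0 * math.pi * initial_mass ** 2
```

Under these assumptions, two emissions are correlated by 8*pi*E1*E2 rather than zero, and the entropy carried away by a full sequence of emissions adds up to 4*pi*M^2, the Bekenstein-Hawking entropy of the initial black hole.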
In transportation, communication, social, and other real complex networks, some critical edges play a pivotal part in controlling the flow of information and maintaining the integrity of the structure. Due to the importance of critical edges in theoretical studies and practical applications, the identification of critical edges has gradually become a hot topic in current research. Considering the overlap of communities in the neighborhood of edges, a novel and effective metric named subgraph overlap (SO) is proposed to quantify the significance of edges. The experimental results show that SO outperforms all benchmarks in identifying critical edges, which are crucial in maintaining the integrity of the structure and functions of networks.