No Arabic abstract
The topological structure of complex networks has fascinated researchers for several decades, resulting in the discovery of many universal properties and reoccurring characteristics of different kinds of networks. However, much less is known today about the network dynamics: indeed, complex networks in reality are not static, but rather dynamically evolve over time. Our paper is motivated by the empirical observation that network evolution patterns seem far from random, but exhibit structure. Moreover, the specific patterns appear to depend on the network type, contradicting the existence of a one fits it all model. However, we still lack observables to quantify these intuitions, as well as metrics to compare graph evolutions. Such observables and metrics are needed for extrapolating or predicting evolutions, as well as for interpolating graph evolutions. To explore the many faces of graph dynamics and to quantify temporal changes, this paper suggests to build upon the concept of centrality, a measure of node importance in a network. In particular, we introduce the notion of centrality distance, a natural similarity measure for two graphs which depends on a given centrality, characterizing the graph type. Intuitively, centrality distances reflect the extent to which (non-anonymous) node roles are different or, in case of dynamic graphs, have changed over time, between two graphs. We evaluate the centrality distance approach for five evolutionary models and seven real-world social and physical networks. Our results empirically show the usefulness of centrality distances for characterizing graph dynamics compared to a null-model of random evolution, and highlight the differences between the considered scenarios. Interestingly, our approach allows us to compare the dynamics of very different networks, in terms of scale and evolution speed.
The topological structure of complex networks has fascinated researchers for several decades, resulting in the discovery of many universal properties and reoccurring characteristics of different kinds of networks. However, much less is known today about the network dynamics: indeed, complex networks in reality are not static, but rather dynamically evolve over time. Our paper is motivated by the empirical observation that network evolution patterns seem far from random, but exhibit structure. Moreover, the specific patterns appear to depend on the network type, contradicting the existence of a one fits it all model. However, we still lack observables to quantify these intuitions, as well as metrics to compare graph evolutions. Such observables and metrics are needed for extrapolating or predicting evolutions, as well as for interpolating graph evolutions. To explore the many faces of graph dynamics and to quantify temporal changes, this paper suggests to build upon the concept of centrality, a measure of node importance in a network. In particular, we introduce the notion of centrality distance, a natural similarity measure for two graphs which depends on a given centrality, characterizing the graph type. Intuitively, centrality distances reflect the extent to which (non-anonymous) node roles are different or, in case of dynamic graphs, have changed over time, between two graphs. We evaluate the centrality distance approach for five evolutionary models and seven real-world social and physical networks. Our results empirically show the usefulness of centrality distances for characterizing graph dynamics compared to a null-model of random evolution, and highlight the differences between the considered scenarios. Interestingly, our approach allows us to compare the dynamics of very different networks, in terms of scale and evolution speed.
Most past work on social network link fraud detection tries to separate genuine users from fraudsters, implicitly assuming that there is only one type of fraudulent behavior. But is this assumption true? And, in either case, what are the characteristics of such fraudulent behaviors? In this work, we set up honeypots (dummy social network accounts), and buy fake followers (after careful IRB approval). We report the signs of such behaviors including oddities in local network connectivity, account attributes, and similarities and differences across fraud providers. Most valuably, we discover and characterize several types of fraud behaviors. We discuss how to leverage our insights in practice by engineering strongly performing entropy-based features and demonstrating high classification accuracy. Our contributions are (a) instrumentation: we detail our experimental setup and carefully engineered data collection process to scrape Twitter data while respecting API rate-limits, (b) observations on fraud multimodality: we analyze our honeypot fraudster ecosystem and give surprising insights into the multifaceted behaviors of these fraudster types, and (c) features: we propose novel features that give strong (>0.95 precision/recall) discriminative power on ground-truth Twitter data.
Recently, many systems for graph analysis have been developed to address the growing needs of both industry and academia to study complex graphs. Insight into the practical uses of graph analysis will allow future developments of such systems to optimize for real-world usage, instead of targeting single use cases or hypothetical workloads. This insight may be derived from surveys on the applications of graph analysis. However, existing surveys are limited in the variety of application domains, datasets, and/or graph analysis techniques they study. In this work we present and apply a systematic method for identifying practical use cases of graph analysis. We identify commonly used graph features and analysis methods and use our findings to construct a taxonomy of graph analysis applications. We conclude that practical use cases of graph analysis cover a diverse set of graph features and analysis methods. Furthermore, most applications combine multiple features and methods. Our findings motivate further development of graph analysis systems to support a broader set of applications and to facilitate the combination of multiple analysis methods in an (interactive) workflow.
Graph models, like other machine learning models, have implicit and explicit biases built-in, which often impact performance in nontrivial ways. The models faithfulness is often measured by comparing the newly generated graph against the source graph using any number or combination of graph properties. Differences in the size or topology of the generated graph therefore indicate a loss in the model. Yet, in many systems, errors encoded in loss functions are subtle and not well understood. In the present work, we introduce the Infinity Mirror test for analyzing the robustness of graph models. This straightforward stress test works by repeatedly fitting a model to its own outputs. A hypothetically perfect graph model would have no deviation from the source graph; however, the models implicit biases and assumptions are exaggerated by the Infinity Mirror test, exposing potential issues that were previously obscured. Through an analysis of thousands of experiments on synthetic and real-world graphs, we show that several conventional graph models degenerate in exciting and informative ways. We believe that the observed degenerative patterns are clues to the future development of better graph models.
In online social media systems users are not only posting, consuming, and resharing content, but also creating new and destroying existing connections in the underlying social network. While each of these two types of dynamics has individually been studied in the past, much less is known about the connection between the two. How does user information posting and seeking behavior interact with the evolution of the underlying social network structure? Here, we study ways in which network structure reacts to users posting and sharing content. We examine the complete dynamics of the Twitter information network, where users post and reshare information while they also create and destroy connections. We find that the dynamics of network structure can be characterized by steady rates of change, interrupted by sudden bursts. Information diffusion in the form of cascades of post re-sharing often creates such sudden bursts of new connections, which significantly change users local network structure. These bursts transform users networks of followers to become structurally more cohesive as well as more homogenous in terms of follower interests. We also explore the effect of the information content on the dynamics of the network and find evidence that the appearance of new topics and real-world events can lead to significant changes in edge creations and deletions. Lastly, we develop a model that quantifies the dynamics of the network and the occurrence of these bursts as a function of the information spreading through the network. The model can successfully predict which information diffusion events will lead to bursts in network dynamics.