Studies on friendships in online social networks involving geographic distance have so far relied on the city location provided in users' profiles. Consequently, most research on friendships has located users with city-level accuracy at best. This study analyzes a Twitter dataset because it provides the exact geographic distance between corresponding users. We start by introducing a strong definition of friend on Twitter (i.e., a definition of bidirectional friendship), requiring bidirectional communication. Next, we utilize geo-tagged mentions delivered by users to determine their locations, where @username is contained anywhere in the body of tweets. To provide analysis results, we first introduce a friend counting algorithm. Exploiting the fact that Twitter users are likely to post consecutive tweets in the static mode, we also introduce a two-stage distance estimation algorithm. As the first of our main contributions, we verify that the number of friends of a particular Twitter user follows a well-known power-law distribution (i.e., a Zipf's distribution or a Pareto distribution). Our study also provides the following newly discovered friendship degree related to the issue of space: the number of friends according to distance follows a double power-law (i.e., a double Pareto law) distribution, indicating that the probability of befriending a particular Twitter user is significantly reduced beyond a certain geographic distance between users, termed the separation point. Our analysis provides concrete evidence that Twitter can be a useful platform for assigning a more accurate scalar value to the degree of friendship between two users.
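The bidirectional definition of friendship described above can be illustrated with a minimal sketch. This is not the paper's friend counting algorithm, whose details the abstract does not give; it assumes mentions are available as hypothetical (sender, mentioned user) pairs and treats two users as friends only when each has mentioned the other at least once.

```python
from collections import defaultdict


def count_bidirectional_friends(mentions):
    """Count friends per user under a mutual-mention definition.

    `mentions` is an iterable of (sender, mentioned_user) pairs; this input
    format is an assumption made for illustration only.
    """
    outgoing = defaultdict(set)
    for sender, target in mentions:
        if sender != target:  # ignore self-mentions
            outgoing[sender].add(target)

    counts = {}
    for user, targets in outgoing.items():
        # A target is a friend only if it has also mentioned `user`.
        counts[user] = sum(
            1 for target in targets if user in outgoing.get(target, ())
        )
    return counts
```

Users who never sent a mention do not appear in the result, since they cannot satisfy the bidirectional requirement.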
This study analyzes friendships in online social networks involving geographic distance with a geo-referenced Twitter dataset, which provides the exact distance between corresponding users. We start by introducing a strong definition of friend on Twitter, requiring bidirectional communication. Next, by utilizing geo-tagged mentions delivered by users to determine their locations, we introduce a two-stage distance estimation algorithm. As our main contribution, our study provides the following newly discovered friendship degree related to the issue of space: the number of friends according to distance follows a double power-law (i.e., a double Pareto law) distribution, indicating that the probability of befriending a particular Twitter user is significantly reduced beyond a certain geographic distance between users, termed the separation point. Our analysis provides a much more fine-grained view of social ties in space than conventional results, which show a homogeneous power-law with distance.
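To make the notion of a separation point concrete, the following sketch evaluates one common parameterization of a double power-law: one exponent below the separation distance and a steeper one above it, joined continuously. The abstract does not state the functional form or any fitted values, so the function name, the parameters `d_s`, `alpha1`, `alpha2`, and the scale `c` are all illustrative assumptions.

```python
def double_power_law(d, d_s, alpha1, alpha2, c=1.0):
    """Unnormalized double power-law in distance d, continuous at d_s.

    d_s is the assumed separation point; alpha1 applies for d <= d_s and
    alpha2 for d > d_s. All parameter names are hypothetical.
    """
    if d <= 0 or d_s <= 0:
        raise ValueError("distances must be positive")
    if d <= d_s:
        return c * d ** (-alpha1)
    # The factor d_s ** (alpha2 - alpha1) makes both branches agree at d_s.
    return c * d_s ** (alpha2 - alpha1) * d ** (-alpha2)
```

With `alpha2 > alpha1`, the value drops faster beyond `d_s`, which is the qualitative behavior the abstract describes.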
Users on Twitter are commonly identified by their profile names. These names are used when directly addressing users on Twitter, are part of their profile page URLs, and can become a trademark for popular accounts, with people referring to celebrities by their real name and their profile name interchangeably. Twitter, however, has chosen not to permanently link profile names to their corresponding user accounts. In fact, Twitter allows users to change their profile name, and afterwards makes the old profile names available for other users to take. In this paper, we provide a large-scale study of the phenomenon of profile name reuse on Twitter. We show that this phenomenon is not uncommon, investigate the dynamics of profile name reuse, and characterize the accounts that are involved in it. We find that many of these accounts adopt abandoned profile names for questionable purposes, such as spreading malicious content and exploiting the profile names' popularity for search engine optimization. Finally, we show that this problem is not unique to Twitter (as other popular online social networks also release profile names) and argue that the risks involved with profile name reuse outweigh the advantages provided by this feature.
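A minimal sketch of how profile name reuse could be detected from repeated observations of accounts follows. It is not the paper's measurement pipeline; it assumes hypothetical records of the form (timestamp, user_id, profile_name) and flags any name that has been held by more than one account over time.

```python
from collections import defaultdict


def find_reused_names(observations):
    """Return {profile_name: [user_ids in order of first holding it]}.

    `observations` is an iterable of (timestamp, user_id, profile_name)
    tuples; this record format is assumed for illustration. Names are
    compared case-insensitively, which is also an assumption here.
    """
    holders = defaultdict(list)
    for _timestamp, user_id, name in sorted(observations):
        history = holders[name.lower()]
        # Record a holder only when the name changes hands.
        if not history or history[-1] != user_id:
            history.append(user_id)
    return {
        name: ids for name, ids in holders.items() if len(set(ids)) > 1
    }
```

A name that a single account drops and later reclaims is not reported, because only one distinct account ever held it.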
We present an open-source interface for scientists to explore Twitter data through interactive network visualizations. Combining data collection, transformation and visualization in one easily accessible framework, the twitter explorer connects distant and close reading of Twitter data through the interactive exploration of interaction networks and semantic networks. By lowering the technological barriers of data-driven research, it aims to attract researchers from various disciplinary backgrounds and facilitates new perspectives in the thriving field of computational social science.
As a fundamental challenge across many disciplines, link prediction aims to identify potential links in a network based on incomplete observed information, with broad applications ranging from uncovering missing protein-protein interactions to predicting the evolution of networks. One of the most influential families of methods relies on similarity indices based on common neighbors or their variations. We construct a hidden space by mapping a network into Euclidean space based solely on its connection structure. When compared with the real geographical locations of nodes, our reconstructed locations are consistent with them. The distances between nodes in our hidden space can serve as a novel similarity metric in link prediction. In addition, we combine our hidden space method with other state-of-the-art similarity methods, and the resulting hybrid substantially outperforms existing methods in prediction accuracy. Hence, our hidden space reconstruction model provides a fresh perspective for understanding network structure and, in particular, casts new light on link prediction.
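For readers unfamiliar with the similarity indices mentioned above, the following sketch scores every unconnected node pair by its number of common neighbors. It illustrates only this baseline index, not the hidden space method, and assumes an undirected network given as a hypothetical list of edges.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Score each non-adjacent node pair by its number of common neighbors.

    `edges` is an iterable of undirected (u, v) pairs; node labels must be
    sortable. Higher scores suggest more likely missing links.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v not in neighbors[u]:  # only score pairs without an edge
            scores[(u, v)] = len(neighbors[u] & neighbors[v])
    return scores
```

Ranking the returned pairs by score gives a simple candidate list of missing links against which other similarity metrics can be compared.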
Community detection is a significant and challenging task in network research. Recently, local methods of community detection have attracted considerable attention. Among them, community detection with a greedy algorithm typically starts by identifying locally essential nodes, called central nodes of the network; communities then expand from these central nodes by optimizing a modularity function. In this paper, we propose a new central node indicator and a new modularity function. Our central node indicator, which we call the local centrality indicator (LCI), is as efficient as the well-known global maximal degree indicator and local maximal degree indicator; on certain special network structures, LCI performs even better. In addition, our modularity function F2 overcomes certain disadvantages, such as the resolution limit problem, of the modularity functions proposed in the previous literature. Combined with a greedy algorithm, LCI and F2 enable us to identify the right community structures for both the real-world networks and the simulated benchmark network. Evaluation based on the normalized mutual information (NMI) suggests that our community detection method with a greedy algorithm based on LCI and F2 outperforms many other methods. Therefore, the method proposed in this paper is potentially noteworthy.
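The NMI evaluation mentioned above can be sketched as follows. The abstract does not say which normalization is used, so this example assumes the variant that divides twice the mutual information by the sum of the two entropies; the function name and the input format (one community label per node, in the same node order for both partitions) are illustrative.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_a, labels_b):
    """NMI = 2 * I(A; B) / (H(A) + H(B)) for two partitions of the same nodes."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label lists must be non-empty and of equal length")

    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    mutual_info = sum(
        (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
        for (a, b), n_ab in joint.items()
    )
    entropy_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * math.log(c / n) for c in count_b.values())

    if entropy_a + entropy_b == 0:
        return 1.0  # both partitions place every node in a single community
    return 2 * mutual_info / (entropy_a + entropy_b)
```

A score near 1 means the detected communities match the reference partition up to relabeling, while a score near 0 means the two partitions are statistically independent.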