No Arabic abstract
Social groups play a crucial role in social media platforms because they form the basis for user participation and engagement. Groups are created explicitly by members of the community, but also form organically as members interact. Due to their importance, they have been studied widely (e.g., community detection, evolution, activity, etc.). One of the key questions for understanding how such groups evolve is whether there are different types of groups and how they differ. In Sociology, theories have been proposed to help explain how such groups form. In particular, the common identity and common bond theory states that people join groups based on identity (i.e., interest in the topics discussed) or bond attachment (i.e., social relationships). The theory has been applied qualitatively to small groups to classify them as either topical or social. We use the identity and bond theory to define a set of features to classify groups into those two categories. Using a dataset from Flickr, we extract user-defined groups and automatically-detected groups, obtained from a community detection algorithm. We discuss the process of manual labeling of groups into social or topical and present results of predicting the group label based on the defined features. We directly validate the predictions of the theory showing that the metrics are able to forecast the group type with high accuracy. In addition, we present a comparison between declared and detected groups along topicality and sociality dimensions.
A number of recent studies of information diffusion in social media, both empirical and theoretical, have been inspired by viral propagation models derived from epidemiology. These studies model the propagation of memes, i.e., pieces of information, between users in a social network similarly to the way diseases spread in human society. Importantly, one would expect a meme to spread in a social network amongst the people who are interested in the topic of that meme. Yet, the importance of topicality for information diffusion has been less explored in the literature. Here, we study empirical data about two different types of memes (hashtags and URLs) spreading through the Twitters online social network. For every meme, we infer its topics and for every user, we infer her topical interests. To analyze the impact of such topics on the propagation of memes, we introduce a novel theoretical framework of information diffusion. Our analysis identifies two distinct mechanisms, namely topical and non-topical, of information diffusion. The non-topical information diffusion resembles disease spreading as in simple contagion. In contrast, the topical information diffusion happens between users who are topically aligned with the information and has characteristics of complex contagion. Non-topical memes spread broadly among all users and end up being relatively popular. Topical memes spread narrowly among users who have interests topically aligned with them and are diffused more readily after multiple exposures. Our results show that the topicality of memes and users interests are essential for understanding and predicting information diffusion.
Peoples interests and peoples social relationships are intuitively connected, but understanding their interplay and whether they can help predict each other has remained an open question. We examine the interface of two decisive structures forming the backbone of online social media: the graph structure of social networks - who connects with whom - and the set structure of topical affiliations - who is interested in what. In studying this interface, we identify key relationships whereby each of these structures can be understood in terms of the other. The context for our analysis is Twitter, a complex social network of both follower relationships and communication relationships. On Twitter, hashtags are used to label conversation topics, and we examine hashtag usage alongside these social structures. We find that the hashtags that users adopt can predict their social relationships, and also that the social relationships between the initial adopters of a hashtag can predict the future popularity of that hashtag. By studying weighted social relationships, we observe that while strong reciprocated ties are the easiest to predict from hashtag structure, they are also much less useful than weak directed ties for predicting hashtag popularity. Importantly, we show that computationally simple structural determinants can provide remarkable performance in both tasks. While our analyses focus on Twitter, we view our findings as broadly applicable to topical affiliations and social relationships in a host of diverse contexts, including the movies people watch, the brands people like, or the locations people frequent.
There has been a tremendous rise in the growth of online social networks all over the world in recent years. It has facilitated users to generate a large amount of real-time content at an incessant rate, all competing with each other to attract enough attention and become popular trends. While Western online social networks such as Twitter have been well studied, the popular Chinese microblogging network Sina Weibo has had relatively lower exposure. In this paper, we analyze in detail the temporal aspect of trends and trend-setters in Sina Weibo, contrasting it with earlier observations in Twitter. We find that there is a vast difference in the content shared in China when compared to a global social network such as Twitter. In China, the trends are created almost entirely due to the retweets of media content such as jokes, images and videos, unlike Twitter where it has been shown that the trends tend to have more to do with current global events and news stories. We take a detailed look at the formation, persistence and decay of trends and examine the key topics that trend in Sina Weibo. One of our key findings is that retweets are much more common in Sina Weibo and contribute a lot to creating trends. When we look closer, we observe that most trends in Sina Weibo are due to the continuous retweets of a small percentage of fraudulent accounts. These fake accounts are set up to artificially inflate certain posts, causing them to shoot up into Sina Weibos trending list, which are in turn displayed as the most popular topics to users.
Social media sites are information marketplaces, where users produce and consume a wide variety of information and ideas. In these sites, users typically choose their information sources, which in turn determine what specific information they receive, how much information they receive and how quickly this information is shown to them. In this context, a natural question that arises is how efficient are social media users at selecting their information sources. In this work, we propose a computational framework to quantify users efficiency at selecting information sources. Our framework is based on the assumption that the goal of users is to acquire a set of unique pieces of information. To quantify users efficiency, we ask if the user could have acquired the same pieces of information from another set of sources more efficiently. We define three different notions of efficiency -- link, in-flow, and delay -- corresponding to the number of sources the user follows, the amount of (redundant) information she acquires and the delay with which she receives the information. Our definitions of efficiency are general and applicable to any social media system with an underlying information network, in which every user follows others to receive the information they produce. In our experiments, we measure the efficiency of Twitter users at acquiring different types of information. We find that Twitter users exhibit sub-optimal efficiency across the three notions of efficiency, although they tend to be more efficient at acquiring non-popular than popular pieces of information. We then show that this lack of efficiency is a consequence of the triadic closure mechanism by which users typically discover and follow other users in social media. Finally, we develop a heuristic algorithm that enables users to be significantly more efficient at acquiring the same unique pieces of information.
The information collected by mobile phone operators can be considered as the most detailed information on human mobility across a large part of the population. The study of the dynamics of human mobility using the collected geolocations of users, and applying it to predict future users locations, has been an active field of research in recent years. In this work, we study the extent to which social phenomena are reflected in mobile phone data, focusing in particular in the cases of urban commute and major sports events. We illustrate how these events are reflected in the data, and show how information about the events can be used to improve predictability in a simple model for a mobile phone users location.