The problem of ideology detection is to study the latent (political) placement of people, which has traditionally been studied for politicians according to their voting behavior. Recently, more and more studies have begun to address the ideology detection problem for ordinary users based on their online behavior as captured by social media, e.g., Twitter. As far as we know, however, the vast majority of existing methods for ideology detection on social media oversimplify the problem as binary classification (i.e., liberal vs. conservative). Moreover, although social links can play a critical role in determining one's ideology, most existing work ignores the heterogeneous types of links in social media. In this paper we propose to detect numerical ideology positions for Twitter users according to their follow, mention, and retweet links to a selected set of politicians. A unified probabilistic model is proposed that can (1) explain why links are built among people in terms of their ideology, (2) integrate heterogeneous types of links in determining people's ideology, and (3) automatically learn the quality of each type of link in determining one's ideology. Experiments demonstrate the advantages of our model in terms of both ranking and political leaning classification accuracy. They show that (1) using multiple types of links is better than using any single type of link alone to determine one's ideology, and (2) our model's advantage over the baselines is even larger for people who are sparsely linked in one type of link. We also show that the detected ideology for Twitter users aligns well with our intuition.
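As a rough illustration of how heterogeneous links with per-type quality weights could determine a numerical position, the following minimal sketch uses a simple logistic link model. It is an assumption made for illustration only and not the paper's actual formulation: the functions `link_probability`, `log_likelihood`, and `estimate_position`, the weight values, and the search interval [-2, 2] are all hypothetical.

```python
import math

def link_probability(user_pos, politician_pos, type_weight, bias=0.0):
    """Hypothetical logistic link model: positions on the same side of zero
    raise the link probability, scaled by a per-link-type quality weight."""
    z = type_weight * user_pos * politician_pos + bias
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(user_pos, links, politician_pos, type_weights):
    """links: list of (politician, link_type, observed) with observed in {0, 1}."""
    total = 0.0
    for politician, link_type, observed in links:
        p = link_probability(user_pos, politician_pos[politician],
                             type_weights[link_type])
        total += math.log(p) if observed else math.log(1.0 - p)
    return total

def estimate_position(links, politician_pos, type_weights):
    """Grid search over an assumed interval [-2, 2] in steps of 0.05."""
    candidates = [i / 20.0 for i in range(-40, 41)]
    return max(candidates,
               key=lambda x: log_likelihood(x, links, politician_pos, type_weights))
```

In this sketch, a larger weight for a link type means that links of that type carry more evidence about the user's position. The weights are fixed by hand here rather than learned.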
Hateful speech in Online Social Networks (OSNs) is a key challenge for companies and governments, as it impacts users and advertisers, and as several countries have strict legislation against the practice. This has motivated work on detecting and characterizing the phenomenon in tweets, social media posts and comments. However, these approaches face several shortcomings due to the noisiness of OSN data, the sparsity of the phenomenon, and the subjectivity of the definition of hate speech. This work presents a user-centric view of hate speech, paving the way for better detection methods and understanding. We collect a Twitter dataset of $100,386$ users along with up to $200$ tweets from their timelines with a random-walk-based crawler on the retweet graph, and select a subsample of $4,972$ to be manually annotated as hateful or not through crowdsourcing. We examine the differences between hateful and normal users in terms of activity patterns, the content they disseminate, and network centrality measurements in the sampled graph. Our results show that hateful users have more recent account creation dates and more statuses and followees per day. Additionally, they favorite more tweets, tweet at shorter intervals and are more central in the retweet network, contradicting the lone wolf stereotype often associated with such behavior. Hateful users are more negative, more profane, and use fewer words associated with topics such as hate, terrorism, violence and anger. We also identify similarities between hateful/normal users and their 1-neighborhood, suggesting strong homophily.
Propagation of political ideologies in social networks has shown a notable impact on voting behavior. Both the content of the messages (the ideology) and the politicians' influence on their online audiences (their followers) have been associated with such an impact. Here we evaluate which of these factors played a major role in deciding the results of the 2015 Colombian regional elections by evaluating the linguistic similarity of political ideologies and their influence on the Twitter sphere. The electoral results proved to be strongly associated with tweets and retweets and not with the linguistic content of their ideologies or their Twitter followers. Finally, suggestions on new ways to analyze electoral processes are discussed.
The global public sphere has changed dramatically over the past decades: a significant part of public discourse now takes place on algorithmically driven platforms owned by a handful of private companies. Despite its growing importance, there is scant large-scale academic research on the long-term evolution of user behaviour on these platforms, because the data are often proprietary to the platforms. Here, we evaluate the individual behaviour of 600,000 Twitter users between 2012 and 2019 and find empirical evidence for an acceleration of the way Twitter is used on an individual level. This manifests itself in the fact that cohorts of Twitter users behave differently depending on when they joined the platform. Behaviour within a cohort is relatively consistent over time and characterised by strong internal interactions, but from cohort to cohort behaviour shifts towards increased activity. Specifically, we measure this in terms of more tweets per user over time, denser interactions with others via retweets, and shorter content horizons, expressed as an individual's decaying autocorrelation of topics over time. Our observations are explained by a growing proportion of active users who not only tweet more actively but also elicit more retweets. These behaviours suggest a collective contribution to an increased flow of information through each cohort's news feed, an increase that potentially depletes available collective attention over time. Our findings complement recent empirical work on social acceleration, which has been largely agnostic about individual user activity.
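The notion of a content horizon as a decaying autocorrelation of topics can be illustrated with a minimal sketch. It assumes, hypothetically, that each user's activity is summarised as one topic-weight vector per time window and that the similarity between windows is measured by cosine similarity. The abstract does not specify these details, so `cosine` and `topic_autocorrelation` are illustrative names only, and the similarity measure is a stand-in for whatever autocorrelation the study actually uses.

```python
import math

def cosine(a, b):
    """Cosine similarity of two topic-weight vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def topic_autocorrelation(windows, max_lag):
    """Mean similarity between windows that are `lag` steps apart,
    for lag = 1 .. max_lag. A faster decay suggests a shorter content horizon."""
    result = []
    for lag in range(1, max_lag + 1):
        sims = [cosine(windows[t], windows[t + lag])
                for t in range(len(windows) - lag)]
        result.append(sum(sims) / len(sims) if sims else 0.0)
    return result
```

Under these assumptions, a user who switches topics quickly produces a curve that drops towards zero after a few lags, whereas a user who stays on the same topics keeps values close to one.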
Twitter users operated by automated programs, also known as bots, have become more prevalent recently and induced undesirable social effects. While extensive research efforts have been devoted to the task of Twitter bot detection, previous methods leverage only a small fraction of user semantic and profile information, which leads to their failure in identifying bots that exploit multi-modal user information to disguise themselves as genuine users. Apart from that, state-of-the-art bot detectors fail to leverage user follow relationships and the graph structure they form. As a result, these methods fall short of capturing new generations of Twitter bots that act in groups and seem genuine individually. To address these two challenges of Twitter bot detection, we propose BotRGCN, which is short for Bot detection with Relational Graph Convolutional Networks. BotRGCN addresses the challenge of community by constructing a heterogeneous graph from follow relationships and applying relational graph convolutional networks to the Twittersphere. In addition, BotRGCN makes use of multi-modal user semantic and property information to avoid feature engineering and augment its ability to capture bots with diversified disguises. Extensive experiments demonstrate that BotRGCN outperforms competitive baselines on TwiBot-20, a comprehensive benchmark that provides follow relationships. BotRGCN is also shown to effectively leverage three modalities of user information, namely semantic, property and neighborhood information, to boost bot detection performance.
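To make the relational graph convolution idea concrete, the following is a minimal pure-Python sketch of a single R-GCN-style layer: each node combines its own transformed features with the mean of its neighbours' features, transformed by a separate weight matrix per relation. It is not BotRGCN's implementation. The relation names, the hand-set weight matrices and the helper names `matvec` and `rgcn_layer` are assumptions for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]

def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix w by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def rgcn_layer(features: List[Vector],
               edges: Dict[str, List[Tuple[int, int]]],
               rel_weights: Dict[str, Matrix],
               self_weight: Matrix) -> List[Vector]:
    """One relational graph convolution step followed by ReLU.

    edges maps a relation name to (src, dst) pairs; messages flow src -> dst
    and are averaged over the incoming edges of that relation at dst.
    """
    n = len(features)
    # Self-connection term: W_0 h_i
    out = [matvec(self_weight, h) for h in features]
    for rel, pairs in edges.items():
        indegree = [0] * n
        for _, dst in pairs:
            indegree[dst] += 1
        for src, dst in pairs:
            message = matvec(rel_weights[rel], features[src])
            for k, m in enumerate(message):
                out[dst][k] += m / indegree[dst]
    # ReLU non-linearity
    return [[max(0.0, v) for v in row] for row in out]
```

In a trained model the weight matrices would be learned and the input features would come from the user's semantic and property information. Here they are plain lists so that the aggregation step can be read on its own.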
Twitter bot detection has become an important and challenging task to combat misinformation and protect the integrity of the online discourse. State-of-the-art approaches generally leverage the topological structure of the Twittersphere, while they neglect the heterogeneity of relations and influence among users. In this paper, we propose a novel bot detection framework to alleviate this problem, which leverages the topological structure of user-formed heterogeneous graphs and models varying influence intensity between users. Specifically, we construct a heterogeneous information network with users as nodes and diversified relations as edges. We then propose relational graph transformers to model heterogeneous influence between users and learn node representations. Finally, we use semantic attention networks to aggregate messages across users and relations and conduct heterogeneity-aware Twitter bot detection. Extensive experiments demonstrate that our proposal outperforms state-of-the-art methods on a comprehensive Twitter bot detection benchmark. Additional studies also bear out the effectiveness of our proposed relational graph transformers, semantic attention networks and the graph-based approach in general.