ترغب بنشر مسار تعليمي؟ اضغط هنا

Heterogeneity-aware Twitter Bot Detection with Relational Graph Transformers

421   0   0.0 ( 0 )
 نشر من قبل Shangbin Feng
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Twitter bot detection has become an important and challenging task to combat misinformation and protect the integrity of the online discourse. State-of-the-art approaches generally leverage the topological structure of the Twittersphere, while they neglect the heterogeneity of relations and influence among users. In this paper, we propose a novel bot detection framework to alleviate this problem, which leverages the topological structure of user-formed heterogeneous graphs and models varying influence intensity between users. Specifically, we construct a heterogeneous information network with users as nodes and diversified relations as edges. We then propose relational graph transformers to model heterogeneous influence between users and learn node representations. Finally, we use semantic attention networks to aggregate messages across users and relations and conduct heterogeneity-aware Twitter bot detection. Extensive experiments demonstrate that our proposal outperforms state-of-the-art methods on a comprehensive Twitter bot detection benchmark. Additional studies also bear out the effectiveness of our proposed relational graph transformers, semantic attention networks and the graph-based approach in general.



قيم البحث

اقرأ أيضاً

Twitter users operated by automated programs, also known as bots, have increased their appearance recently and induced undesirable social effects. While extensive research efforts have been devoted to the task of Twitter bot detection, previous metho ds leverage only a small fraction of user semantic and profile information, which leads to their failure in identifying bots that exploit multi-modal user information to disguise as genuine users. Apart from that, the state-of-the-art bot detectors fail to leverage user follow relationships and the graph structure it forms. As a result, these methods fall short of capturing new generations of Twitter bots that act in groups and seem genuine individually. To address these two challenges of Twitter bot detection, we propose BotRGCN, which is short for Bot detection with Relational Graph Convolutional Networks. BotRGCN addresses the challenge of community by constructing a heterogeneous graph from follow relationships and apply relational graph convolutional networks to the Twittersphere. Apart from that, BotRGCN makes use of multi-modal user semantic and property information to avoid feature engineering and augment its ability to capture bots with diversified disguise. Extensive experiments demonstrate that BotRGCN outperforms competitive baselines on a comprehensive benchmark TwiBot-20 which provides follow relationships. BotRGCN is also proved to effectively leverage three modals of user information, namely semantic, property and neighborhood information, to boost bot detection performance.
Twitter has become a vital social media platform while an ample amount of malicious Twitter bots exist and induce undesirable social effects. Successful Twitter bot detection proposals are generally supervised, which rely heavily on large-scale datas ets. However, existing benchmarks generally suffer from low levels of user diversity, limited user information and data scarcity. Therefore, these datasets are not sufficient to train and stably benchmark bot detection measures. To alleviate these problems, we present TwiBot-20, a massive Twitter bot detection benchmark, which contains 229,573 users, 33,488,192 tweets, 8,723,736 user property items and 455,958 follow relationships. TwiBot-20 covers diversified bots and genuine users to better represent the real-world Twittersphere. TwiBot-20 also includes three modals of user information to support both binary classification of single users and community-aware approaches. To the best of our knowledge, TwiBot-20 is the largest Twitter bot detection benchmark to date. We reproduce competitive bot detection methods and conduct a thorough evaluation on TwiBot-20 and two other public datasets. Experiment results demonstrate that existing bot detection measures fail to match their previously claimed performance on TwiBot-20, which suggests that Twitter bot detection remains a challenging task and requires further research efforts.
Twitter is increasingly used for political, advertising and marketing campaigns, where the main aim is to influence users to support specific causes, individuals or groups. We propose a novel methodology for mining and analyzing Twitter campaigns, wh ich includes: (i) collecting tweets and detecting topics relating to a campaign; (ii) mining important campaign topics using scientometrics measures; (iii) modelling user interests using hashtags and topical entropy; (iv) identifying influential users using an adapted PageRank score; and (v) various metrics and visualization techniques for identifying bot-like activities. While this methodology is generalizable to multiple campaign types, we demonstrate its effectiveness on the 2017 German federal election.
Twitter has become a major social media platform since its launching in 2006, while complaints about bot accounts have increased recently. Although extensive research efforts have been made, the state-of-the-art bot detection methods fall short of ge neralizability and adaptability. Specifically, previous bot detectors leverage only a small fraction of user information and are often trained on datasets that only cover few types of bots. As a result, they fail to generalize to real-world scenarios on the Twittersphere where different types of bots co-exist. Additionally, bots in Twitter are constantly evolving to evade detection. Previous efforts, although effective once in their context, fail to adapt to new generations of Twitter bots. To address the two challenges of Twitter bot detection, we propose SATAR, a self-supervised representation learning framework of Twitter users, and apply it to the task of bot detection. In particular, SATAR generalizes by jointly leveraging the semantics, property and neighborhood information of a specific user. Meanwhile, SATAR adapts by pre-training on a massive number of self-supervised users and fine-tuning on detailed bot detection scenarios. Extensive experiments demonstrate that SATAR outperforms competitive baselines on different bot detection datasets of varying information completeness and collection time. SATAR is also proved to generalize in real-world scenarios and adapt to evolving generations of social media bots.
Tweets about everyday events are published on Twitter. Detecting such events is a challenging task due to the diverse and noisy contents of Twitter. In this paper, we propose a novel approach named Weighted Dynamic Heartbeat Graph (WDHG) to detect ev ents from the Twitter stream. Once an event is detected in a Twitter stream, WDHG suppresses it in later stages, in order to detect new emerging events. This unique characteristic makes the proposed approach sensitive to capture emerging events efficiently. Experiments are performed on three real-life benchmark datasets: FA Cup Final 2012, Super Tuesday 2012, and the US Elections 2012. Results show considerable improvement over existing event detection methods in most cases.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا