Detection of Novel Social Bots by Ensembles of Specialized Classifiers


Published by: Onur Varol

Publication date: 2020

Research field: Informatics Engineering

Language: English

Authors: Mohsen Sayyadiharikandeh, Onur Varol, Kai-Cheng Yang


Abstract

Malicious actors create inauthentic social media accounts controlled in part by algorithms, known as social bots, to disseminate misinformation and agitate online discussion. While researchers have developed sophisticated methods to detect abuse, novel bots with diverse behaviors evade detection. We show that different types of bots are characterized by different behavioral features. As a result, supervised learning techniques suffer severe performance deterioration when attempting to detect behaviors not observed in the training data. Moreover, tuning these models to recognize novel bots requires retraining with a significant amount of new annotations, which are expensive to obtain. To address these issues, we propose a new supervised learning method that trains classifiers specialized for each class of bots and combines their decisions through the maximum rule. The ensemble of specialized classifiers (ESC) can better generalize, leading to an average improvement of 56% in F1 score for unseen accounts across datasets. Furthermore, novel bot behaviors are learned with fewer labeled examples during retraining. We deployed ESC in the newest version of Botometer, a popular tool to detect social bots in the wild, with a cross-validation AUC of 0.99.
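The maximum rule mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the two features (posts per day, follower-to-friend ratio), the bot class names, the toy data, and the nearest-centroid scorer that stands in for a real supervised classifier. The sketch only shows the combination step, where one specialized scorer is trained per bot class against human accounts and the ensemble reports the highest of their scores.

```python
import math


def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train_specialized(humans, bots):
    """Toy stand-in for a classifier specialized on ONE bot class vs. humans.

    Returns a function mapping a feature vector to a bot score in [0, 1]:
    the closer the account is to the bot centroid, the higher the score.
    """
    c_h, c_b = centroid(humans), centroid(bots)

    def score(x):
        d_h, d_b = math.dist(x, c_h), math.dist(x, c_b)
        total = d_h + d_b
        return d_h / total if total > 0 else 0.5

    return score


def esc_score(classifiers, x):
    """Maximum rule: return the (class name, score) pair with the top score."""
    return max(((name, clf(x)) for name, clf in classifiers.items()),
               key=lambda pair: pair[1])


# Hypothetical features: [posts per day, follower-to-friend ratio].
humans = [[5.0, 1.0], [8.0, 1.2], [3.0, 0.8]]
bot_classes = {
    "spammer": [[200.0, 0.1], [180.0, 0.2]],
    "fake_follower": [[1.0, 0.01], [0.0, 0.02]],
}

# One specialized scorer per bot class, each trained against the same humans.
classifiers = {name: train_specialized(humans, rows)
               for name, rows in bot_classes.items()}

for account in ([190.0, 0.15], [5.0, 1.0]):
    name, score = esc_score(classifiers, account)
    print(account, name, round(score, 3))
```

Under this design, supporting a newly observed bot class would mean adding one more entry to `classifiers` rather than retraining a single monolithic model, which is one plausible reading of why the abstract reports that fewer labeled examples are needed during retraining.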


Read also

On Profiling Bots in Social Media

107 - Richard Jayadi Oentaryo , Arinto Murdopo , Philips Kokoh Prasetyo 2016

The popularity of social media platforms such as Twitter has led to the proliferation of automated bots, creating both opportunities and challenges in information dissemination, user engagements, and quality of services. Past works on profiling bots had been focused largely on malicious bots, with the assumption that these bots should be removed. In this work, however, we find many bots that are benign, and propose a new, broader categorization of bots based on their behaviors. This includes broadcast, consumption, and spam bots. To facilitate comprehensive analyses of bots and how they compare to human accounts, we develop a systematic profiling framework that includes a rich set of features and classifier bank. We conduct extensive experiments to evaluate the performances of different classifiers under varying time windows, identify the key features of bots, and infer about bots in a larger Twitter population. Our analysis encompasses more than 159K bot and human (non-bot) accounts in Twitter. The results provide interesting insights on the behavioral traits of both benign and malicious bots.

Social and Information Networks; Information Retrieval

The Rise of Social Bots

175 - Emilio Ferrara , Onur Varol , Clayton Davis 2014

The Turing test aimed to recognize the behavior of a human from that of a computer algorithm. Such challenge is more relevant than ever in today's social media context, where limited attention and technology constrain the expressive power of humans, while incentives abound to develop software agents mimicking humans. These social bots interact, often unnoticed, with real people in social media ecosystems, but their abundance is uncertain. While many bots are benign, one can design harmful bots with the goals of persuading, smearing, or deceiving. Here we discuss the characteristics of modern, sophisticated social bots, and how their presence can endanger online ecosystems and our society. We then review current efforts to detect social bots on Twitter. Features related to content, network, sentiment, and temporal patterns of activity are imitated by bots but at the same time can help discriminate synthetic behaviors from human ones, yielding signatures of engineered social tampering.

Social and Information Networks; Computers and Society; Data Analysis, Statistics and Probability

The spread of low-credibility content by social bots

101 - Chengcheng Shao , Giovanni Luca Ciampaglia , Onur Varol 2017

The massive spread of digital misinformation has been identified as a major global risk and has been alleged to influence elections and threaten democracies. Communication, cognitive, social, and computer scientists are engaged in efforts to study the complex causes for the viral diffusion of misinformation online and to develop solutions, while search and social media platforms are beginning to deploy countermeasures. With few exceptions, these efforts have been mainly informed by anecdotal evidence rather than systematic data. Here we analyze 14 million messages spreading 400 thousand articles on Twitter during and following the 2016 U.S. presidential campaign and election. We find evidence that social bots played a disproportionate role in amplifying low-credibility content. Accounts that actively spread articles from low-credibility sources are significantly more likely to be bots. Automated accounts are particularly active in amplifying content in the very early spreading moments, before an article goes viral. Bots also target users with many followers through replies and mentions. Humans are vulnerable to this manipulation, retweeting bots who post links to low-credibility content. Successful low-credibility sources are heavily supported by social bots. These results suggest that curbing social bots may be an effective strategy for mitigating the spread of online misinformation.

Social and Information Networks; Computers and Society; Physics and Society

Social Bots for Online Public Health Interventions

180 - Ashok Deb , Anuja Majmundar , Sungyong Seo 2018

According to the Center for Disease Control and Prevention, in the United States hundreds of thousands initiate smoking each year, and millions live with smoking-related diseases. Many tobacco users discuss their habits and preferences on social media. This work conceptualizes a framework for targeted health interventions to inform tobacco users about the consequences of tobacco use. We designed a Twitter bot named Notobot (short for No-Tobacco Bot) that leverages machine learning to identify users posting pro-tobacco tweets and select individualized interventions to address their interest in tobacco use. We searched the Twitter feed for tobacco-related keywords and phrases, and trained a convolutional neural network using over 4,000 tweets dichotomously manually labeled as either pro-tobacco or not pro-tobacco. This model achieves a 90% recall rate on the training set and 74% on test data. Users posting pro-tobacco tweets are matched with former smokers with similar interests who posted anti-tobacco tweets. Algorithmic matching, based on the power of peer influence, allows for the systematic delivery of personalized interventions based on real anti-tobacco tweets from former smokers. Experimental evaluation suggests that our system would perform well if deployed. This research offers opportunities for public health researchers to increase health awareness at scale. Future work entails deploying the fully operational Notobot system in a controlled experiment within a public health campaign.

Social and Information Networks

Neutral bots probe political bias on social media

122 - Wen Chen , Diogo Pacheco , Kai-Cheng Yang 2020

Social media platforms attempting to curb abuse and misinformation have been accused of political bias. We deploy neutral social bots who start following different news sources on Twitter, and track them to probe distinct biases emerging from platform mechanisms versus user interactions. We find no strong or consistent evidence of political bias in the news feed. Despite this, the news and information to which U.S. Twitter users are exposed depend strongly on the political leaning of their early connections. The interactions of conservative accounts are skewed toward the right, whereas liberal accounts are exposed to moderate content shifting their experience toward the political center. Partisan accounts, especially conservative ones, tend to receive more followers and follow more automated accounts. Conservative accounts also find themselves in denser communities and are exposed to more low-credibility content.

Social and Information Networks; Computers and Society