New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Hate Towards the Political Opponent: A Twitter Corpus Study of the 2020 US Elections on the Basis of Offensive Speech and Stance Detection

كراهية نحو الخصم السياسي: دراسة كوربوس تويتر لانتخابات 2020 الأمريكية على أساس كشف الكلام والموقف الهجومية

339 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

twitter corpus study political opponent twitter corpus Twitter Corpus الدراسة الخصم السياسي تويتر كوربوس صناعة حمض الفوسفور

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The 2020 US Elections have been, more than ever before, characterized by social media campaigns and mutual accusations. We investigate in this paper if this manifests also in online communication of the supporters of the candidates Biden and Trump, by uttering hateful and offensive communication. We formulate an annotation task, in which we join the tasks of hateful/offensive speech detection and stance detection, and annotate 3000 Tweets from the campaign period, if they express a particular stance towards a candidate. Next to the established classes of favorable and against, we add mixed and neutral stances and also annotate if a candidate is mentioned with- out an opinion expression. Further, we an- notate if the tweet is written in an offensive style. This enables us to analyze if supporters of Joe Biden and the Democratic Party communicate differently than supporters of Donald Trump and the Republican Party. A BERT baseline classifier shows that the detection if somebody is a supporter of a candidate can be performed with high quality (.89 F1 for Trump and .91 F1 for Biden), while the detection that somebody expresses to be against a candidate is more challenging (.79 F1 and .64 F1, respectively). The automatic detection of hate/offensive speech remains challenging (with .53 F1). Our corpus is publicly available and constitutes a novel resource for computational modelling of offensive language under consideration of stances.

References used

https://aclanthology.org/

rate research

Arabic Offensive Language on Twitter: Analysis and Experiments

316 - Association for Computation Linguistics 2021 مقالة

Detecting offensive language on Twitter has many applications ranging from detecting/predicting bullying to measuring polarization. In this paper, we focus on building a large Arabic offensive tweet dataset. We introduce a method for building a datas et that is not biased by topic, dialect, or target. We produce the largest Arabic dataset to date with special tags for vulgarity and hate speech. We thoroughly analyze the dataset to determine which topics, dialects, and gender are most associated with offensive tweets and how Arabic speakers useoffensive language. Lastly, we conduct many experiments to produce strong results (F1 =83.2) on the dataset using SOTA techniques.

language on twitter arabic offensive language اللغة على Twitter اللغة الهجومية العربية صناعة حمض الفوسفور

Synthetic Examples Improve Cross-Target Generalization: A Study on Stance Detection on a Twitter corpus.

458 - Association for Computation Linguistics 2021 مقالة

Cross-target generalization is a known problem in stance detection (SD), where systems tend to perform poorly when exposed to targets unseen during training. Given that data annotation is expensive and time-consuming, finding ways to leverage abundan t unlabeled in-domain data can offer great benefits. In this paper, we apply a weakly supervised framework to enhance cross-target generalization through synthetically annotated data. We focus on Twitter SD and show experimentally that integrating synthetic data is helpful for cross-target generalization, leading to significant improvements in performance, with gains in F1 scores ranging from +3.4 to +5.1.

improve cross-target generalization improve cross-target تحسين التعميم المستهدف تحسين الهدف عبر صناعة حمض الفوسفور

Kawarith: an Arabic Twitter Corpus for Crisis Events

378 - Association for Computation Linguistics 2021 مقالة

Social media (SM) platforms such as Twitter provide large quantities of real-time data that can be leveraged during mass emergencies. Developing tools to support crisis-affected communities requires available datasets, which often do not exist for lo w resource languages. This paper introduces Kawarith a multi-dialect Arabic Twitter corpus for crisis events, comprising more than a million Arabic tweets collected during 22 crises that occurred between 2018 and 2020 and involved several types of hazard. Exploration of this content revealed the most discussed topics and information types, and the paper presents a labelled dataset from seven emergency events that serves as a gold standard for several tasks in crisis informatics research. Using annotated data from the same event, a BERT model is fine-tuned to classify tweets into different categories in the multi- label setting. Results show that BERT-based models yield good performance on this task even with small amounts of task-specific training data.

arabic twitter corpus arabic twitter العربية تويتر كوربوس تويتر عربي صناعة حمض الفوسفور

Adversarial Training for News Stance Detection: Leveraging Signals from a Multi-Genre Corpus.

375 - Association for Computation Linguistics 2021 مقالة

Cross-target generalization constitutes an important issue for news Stance Detection (SD). In this short paper, we investigate adversarial cross-genre SD, where knowledge from annotated user-generated data is leveraged to improve news SD on targets u nseen during training. We implement a BERT-based adversarial network and show experimental performance improvements over a set of strong baselines. Given the abundance of user-generated data, which are considerably less expensive to retrieve and annotate than news articles, this constitutes a promising research direction.

leveraging signals multi-genre corpus الاستفادة من الإشارات متعدد الأنواع صناعة حمض الفوسفور

Zero-shot Cross-lingual Content Filtering: Offensive Language and Hate Speech Detection

357 - Association for Computation Linguistics 2021 مقالة

We present a system for zero-shot cross-lingual offensive language and hate speech classification. The system was trained on English datasets and tested on a task of detecting hate speech and offensive social media content in a number of languages wi thout any additional training. Experiments show an impressive ability of both models to generalize from English to other languages. There is however an expected gap in performance between the tested cross-lingual models and the monolingual models. The best performing model (offensive content classifier) is available online as a REST API.

استخلاص الكلمات الرئيسية العصبية cross-lingual content filtering تصفية المحتوى عبر اللغات صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Hate Towards the Political Opponent: A Twitter Corpus Study of the 2020 US Elections on the Basis of Offensive Speech and Stance Detection

كراهية نحو الخصم السياسي: دراسة كوربوس تويتر لانتخابات 2020 الأمريكية على أساس كشف الكلام والموقف الهجومية

Ask ChatGPT about the research

Read More

suggested questions