The 2020 US Elections have been, more than ever before, characterized by social media campaigns and mutual accusations. In this paper, we investigate whether this also manifests in the online communication of the supporters of the candidates Biden and Trump, in the form of hateful and offensive communication. We formulate an annotation task in which we join the tasks of hateful/offensive speech detection and stance detection, and annotate 3,000 tweets from the campaign period as to whether they express a particular stance towards a candidate. In addition to the established classes of favorable and against, we add mixed and neutral stances and also annotate if a candidate is mentioned without an opinion expression. Further, we annotate whether the tweet is written in an offensive style. This enables us to analyze if supporters of Joe Biden and the Democratic Party communicate differently than supporters of Donald Trump and the Republican Party. A BERT baseline classifier shows that detecting whether somebody is a supporter of a candidate can be performed with high quality (.89 F1 for Trump and .91 F1 for Biden), while detecting that somebody expresses to be against a candidate is more challenging (.79 F1 and .64 F1, respectively). The automatic detection of hate/offensive speech remains challenging (with .53 F1). Our corpus is publicly available and constitutes a novel resource for computational modelling of offensive language under consideration of stances.
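To make the reported evaluation concrete, the per-class F1 scores above are the standard harmonic mean of precision and recall computed separately for each stance class. A minimal self-contained sketch (with toy gold/predicted labels, not the actual corpus data) is:

```python
def f1_for_class(gold, pred, cls):
    """F1 of predictions against gold labels for a single class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
    fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
    fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy stance labels for illustration only (hypothetical, not corpus data)
gold = ["favor", "against", "favor", "neutral", "against", "favor"]
pred = ["favor", "favor",   "favor", "neutral", "against", "neutral"]
print(round(f1_for_class(gold, pred, "favor"), 2))  # → 0.67
```

In the paper's setting, `gold` would hold the annotated stance labels towards a candidate and `pred` the BERT classifier's outputs; the .89/.91 and .79/.64 figures correspond to this per-class computation for the favorable and against classes.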