We present a data set consisting of German news articles labeled for political bias on a five-point scale in a semi-supervised way. While earlier work on hyperpartisan news detection uses binary classification (i.e., hyperpartisan or not) and English data, we argue for a more fine-grained classification, covering the full political spectrum (i.e., far-left, left, centre, right, far-right) and for extending research to German data. Understanding political bias helps in accurately detecting hate speech and online abuse. We experiment with different classification methods for political bias detection. Their comparatively low performance (a macro-F1 of 43 for our best setup, compared to a macro-F1 of 79 for the binary classification task) underlines the need for more (balanced) data annotated in a fine-grained way.
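To make the reported metric concrete, the following is a minimal sketch of how macro-F1 can be computed over the five bias labels. It is an illustration only, not the authors' evaluation code: the function name `macro_f1`, the `LABELS` list, and the toy data are assumptions introduced here. The function returns a value between 0 and 1, whereas the abstract reports scores on a 0 to 100 scale.

```python
from collections import Counter

# Hypothetical label set mirroring the five-point scale named in the abstract.
LABELS = ["far-left", "left", "centre", "right", "far-right"]


def macro_f1(gold, predicted, labels=LABELS):
    """Return the unweighted mean of the per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class with no true positives and no errors contributes 0.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy, made-up data: an imbalanced gold set and a majority-class predictor.
gold = ["centre"] * 4 + ["left", "right", "far-left", "far-right"]
majority = ["centre"] * len(gold)

accuracy = sum(g == p for g, p in zip(gold, majority)) / len(gold)
print(f"accuracy = {accuracy:.2f}, macro-F1 = {macro_f1(gold, majority):.2f}")
```

Because every class counts equally, a classifier that only predicts the majority class scores poorly on macro-F1 even when its accuracy looks acceptable, which is one reason the metric suits an imbalanced five-class setting like the one described.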
References used
https://aclanthology.org/