As hate speech spreads across social media and online communities, research continues to work on its automatic detection. Recently, detection performance has been improving thanks to advances in deep learning and the integration of user features. This work investigates the effects that such features can have on a detection model. Unlike previous research, we show that a simple performance comparison does not expose the full impact of including contextual and user information. By leveraging explainability techniques, we show (1) that user features play a role in the model's decisions and (2) how they affect the feature space the model learns. Besides revealing that user features are the reason for the performance gains, and illustrating why, we show how such techniques can be combined to better understand the model and to detect unintended bias.
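To make the abstract's core claim concrete, below is a minimal sketch of how an explainability technique can reveal whether user features drive a classifier's decisions beyond what an accuracy comparison shows. It is not the paper's actual method: the data is synthetic, the feature names (user_history, user_age) are hypothetical, and permutation importance (via scikit-learn) stands in for whichever explainability techniques the authors leverage.

```python
# A minimal sketch (not the paper's exact setup) of probing whether user
# features influence a hate speech classifier's decisions. We train a
# simple model on synthetic text + user features, then use permutation
# importance -- one common explainability technique -- to compare the
# contribution of the two feature groups. All data and feature names
# here are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Hypothetical features: 5 "text" dimensions (e.g., embedding components)
# and 2 "user" dimensions (e.g., prior-abuse rate, account age).
X_text = rng.normal(size=(n, 5))
X_user = rng.normal(size=(n, 2))
# Labels depend on both groups, so both should receive nonzero importance.
logits = X_text[:, 0] + 1.5 * X_user[:, 0]
y = (logits + rng.normal(scale=0.5, size=n) > 0).astype(int)

X = np.hstack([X_text, X_user])
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)

# Permutation importance: shuffle one feature at a time and measure the
# drop in held-out accuracy. A large drop for a user feature indicates
# the model relies on it -- the kind of signal that a plain performance
# comparison between models hides.
result = permutation_importance(clf, X_test, y_test,
                                n_repeats=30, random_state=0)
names = [f"text_{i}" for i in range(5)] + ["user_history", "user_age"]
for name, imp in sorted(zip(names, result.importances_mean),
                        key=lambda t: -t[1]):
    print(f"{name:>12}: {imp:.3f}")
```

Running this prints the features ranked by importance; in this synthetic setup, user_history should rank near the top, illustrating how attribution-style analyses can expose a model's reliance on user information and, by extension, flag potential unintended bias tied to who wrote a post rather than what it says.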