نظرا لأن النهج القائم على المعجم هو أكثر أناقة علميا، أوضح مكونات الحل وأسهل التعميم إلى التطبيقات الأخرى، توفر هذه الورقة نهجا جديدا للغة الهجومية والكشف عن الكلام على وسائل التواصل الاجتماعي، والتي تجسد معجم من الهجوم الضمني والبريثوإقتصار التعبيرات المشروح مع المعلومات السياقية.نظرا لشدة تعليقات وسائل التواصل الاجتماعي المسيئة في البرازيل، وعدم وجود أبحاث باللغة البرتغالية والبرتغالية البرازيلية هي اللغة المستخدمة للتحقق من صحة النماذج.ومع ذلك، قد يتم تطبيق طريقتنا على أي لغة أخرى.تظهر التجارب التي أجراها فعالية النهج المقترح، مما يتفوق على الأساليب الأساسية الحالية للغة البرتغالية.
Since a lexicon-based approach is more elegant scientifically, explaining the solution components and being easier to generalize to other applications, this paper provides a new approach for offensive language and hate speech detection on social media, which embodies a lexicon of implicit and explicit offensive and swearing expressions annotated with contextual information. Due to the severity of the social media abusive comments in Brazil, and the lack of research in Portuguese, Brazilian Portuguese is the language used to validate the models. Nevertheless, our method may be applied to any other language. The conducted experiments show the effectiveness of the proposed approach, outperforming the current baseline methods for the Portuguese language.
References used
https://aclanthology.org/
We introduce HateBERT, a re-trained BERT model for abusive language detection in English. The model was trained on RAL-E, a large-scale dataset of Reddit comments in English from communities banned for being offensive, abusive, or hateful that we hav
The state-of-the-art abusive language detection models report great in-corpus performance, but underperform when evaluated on abusive comments that differ from the training scenario. As human annotation involves substantial time and effort, models th
Current abusive language detection systems have demonstrated unintended bias towards sensitive features such as nationality or gender. This is a crucial issue, which may harm minorities and underrepresented groups if such systems were integrated in r
The use of attention mechanisms in deep learning approaches has become popular in natural language processing due to its outstanding performance. The use of these mechanisms allows one managing the importance of the elements of a sequence in accordan
Abusive language detection has become an important tool for the cultivation of safe online platforms. We investigate the interaction of annotation quality and classifier performance. We use a new, fine-grained annotation scheme that allows us to dist