نقدم نظاما للصفر بالرصاص لغة هجومية عبر اللغات وتصنيف الكلام الكراهية.تم تدريب النظام على مجموعات البيانات الإنجليزية واختباره في مهمة اكتشاف محتوى خطاب الكراهية والوسائط الاجتماعية الهجومية في عدد من اللغات دون أي تدريب إضافي.تظهر التجارب قدرة رائعة لكلا النموذجين للتعميم من اللغة الإنجليزية إلى لغات أخرى.ومع ذلك، هناك فجوة متوقعة في الأداء بين النماذج التي تم اختبارها عبر اللغات والنماذج الأولية.يتوفر أفضل نموذج أداء (مصنف المحتوى الهجومي) عبر الإنترنت ك api بقية.
We present a system for zero-shot cross-lingual offensive language and hate speech classification. The system was trained on English datasets and tested on a task of detecting hate speech and offensive social media content in a number of languages without any additional training. Experiments show an impressive ability of both models to generalize from English to other languages. There is however an expected gap in performance between the tested cross-lingual models and the monolingual models. The best performing model (offensive content classifier) is available online as a REST API.
References used
https://aclanthology.org/
We address the task of automatic hate speech detection for low-resource languages. Rather than collecting and annotating new hate speech data, we show how to use cross-lingual transfer learning to leverage already existing data from higher-resource l
Data filtering for machine translation (MT) describes the task of selecting a subset of a given, possibly noisy corpus with the aim to maximize the performance of an MT system trained on this selected data. Over the years, many different filtering ap
Hate speech detection is an actively growing field of research with a variety of recently proposed approaches that allowed to push the state-of-the-art results. One of the challenges of such automated approaches -- namely recent deep learning models
This paper studies zero-shot cross-lingual transfer of vision-language models. Specifically, we focus on multilingual text-to-video search and propose a Transformer-based model that learns contextual multilingual multimodal embeddings. Under a zero-s
In this paper, we describe experiments designed to evaluate the impact of stylometric and emotion-based features on hate speech detection: the task of classifying textual content into hate or non-hate speech classes. Our experiments are conducted for