This paper describes the approach we used to detect hope speech in the HopeEDI dataset. We experimented with two approaches. In the first, we used contextual embeddings to train classifiers based on logistic regression, random forest, SVM, and LSTM models. The second used a majority-voting ensemble of 11 models obtained by fine-tuning pre-trained transformer models (BERT, ALBERT, RoBERTa, IndicBERT) after adding an output layer. We found the second approach superior for English, Tamil, and Malayalam. Our solution achieved weighted F1 scores of 0.93, 0.75, and 0.49 for English, Malayalam, and Tamil, respectively, ranking 1st in English, 8th in Malayalam, and 11th in Tamil.
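As an illustration of the two mechanics the abstract names, majority voting over model predictions and the weighted F1 score, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names `majority_vote` and `weighted_f1` are hypothetical, the labels in the usage note are placeholders, and the tie-breaking rule (the label seen first among the tied votes wins) is an assumption the abstract does not specify.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote.

    predictions_per_model: one list of labels per model, all the same length.
    Ties go to the label that appears first among the tied votes
    (an assumption made for this sketch, not taken from the paper).
    """
    n_examples = len(predictions_per_model[0])
    if any(len(p) != n_examples for p in predictions_per_model):
        raise ValueError("every model must predict the same number of examples")

    voted = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the most frequent label
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / total
    return score
```

With an odd number of voters, such as the 11 models mentioned above, a two-label task can never tie; with three or more labels a tie is still possible, which is why the sketch states its tie rule explicitly. A typical call would be `weighted_f1(gold_labels, majority_vote(all_model_predictions))`, where both argument names are placeholders.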
References used
https://aclanthology.org/
In this paper, we describe our approach towards utilizing pre-trained models for the task of hope speech detection. We participated in Task 2: Hope Speech Detection for Equality, Diversity and Inclusion at LT-EDI-2021 @ EACL2021. The goal of this tas
Analyzing and deciphering code-mixed data is imperative in academia and industry in a multilingual country like India in order to solve Natural Language Processing problems. This paper proposes a bidirectional long short-term memory (BiLSTM)
In a world with serious challenges like climate change, religious and political conflicts, global pandemics, terrorism, and racial discrimination, an internet full of hate speech and abusive and offensive content is the last thing we want. In this
In this paper we work with a hope speech detection corpus that includes English, Tamil, and Malayalam datasets. We present a two-phase mechanism to detect hope speech. In the first phase we build a classifier to identify the language of the text. In
This paper mainly introduces the task "Hope Speech Detection for Equality, Diversity, and Inclusion at LT-EDI 2021-EACL 2021". Datasets in three languages were provided, and we chose the English dataset to complete this