Do you want to publish a course? Click here

Spartans@LT-EDI-EACL2021: Inclusive Speech Detection using Pretrained Language Models

Spartans @ LT-EDI-EACL2021: كشف الكلام الشامل باستخدام نماذج اللغة المحددة مسبقا

254   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

We describe our system that ranked first in Hope Speech Detection (HSD) shared task and fourth in Offensive Language Identification (OLI) shared task, both in Tamil language. The goal of HSD and OLI is to identify if a code-mixed comment or post contains hope speech or offensive content respectively. We pre-train a transformer-based model RoBERTa using synthetically generated code-mixed data and use it in an ensemble along with their pre-trained ULMFiT model available from iNLTK.



References used
https://aclanthology.org/
rate research

Read More

In this paper, we describe our approach towards utilizing pre-trained models for the task of hope speech detection. We participated in Task 2: Hope Speech Detection for Equality, Diversity and Inclusion at LT-EDI-2021 @ EACL2021. The goal of this tas k is to predict the presence of hope speech, along with the presence of samples that do not belong to the same language in the dataset. We describe our approach to fine-tuning RoBERTa for Hope Speech detection in English and our approach to fine-tuning XLM-RoBERTa for Hope Speech detection in Tamil and Malayalam, two low resource Indic languages. We demonstrate the performance of our approach on classifying text into hope-speech, non-hope and not-language. Our approach ranked 1st in English (F1 = 0.93), 1st in Tamil (F1 = 0.61) and 3rd in Malayalam (F1 = 0.83).
This paper aims to describe the approach we used to detect hope speech in the HopeEDI dataset. We experimented with two approaches. In the first approach, we used contextual embeddings to train classifiers using logistic regression, random forest, SV M, and LSTM based models. The second approach involved using a majority voting ensemble of 11 models which were obtained by fine-tuning pre-trained transformer models (BERT, ALBERT, RoBERTa, IndicBERT) after adding an output layer. We found that the second approach was superior for English, Tamil and Malayalam. Our solution got a weighted F1 score of 0.93, 0.75 and 0.49 for English, Malayalam and Tamil respectively. Our solution ranked 1st in English, 8th in Malayalam and 11th in Tamil.
Analysis and deciphering code-mixed data is imperative in academia and industry, in a multilingual country like India, in order to solve problems apropos Natural Language Processing. This paper proposes a bidirectional long short-term memory (BiLSTM) with the attention-based approach, in solving the hope speech detection problem. Using this approach an F1-score of 0.73 (9thrank) in the Malayalam-English data set was achieved from a total of 31 teams who participated in the competition.
In today's society, the rapid development of communication technology allows us to communicate with people from different parts of the world. In the process of communication, each person treats others differently. Some people are used to using offens ive and sarcastic language to express their views. These words cause pain to others and make people feel down. Some people are used to sharing happiness with others and encouraging others. Such people bring joy and hope to others through their words. On social media platforms, these two kinds of language are all over the place. If people want to make the online world a better place, they will have to deal with both. So identifying offensive language and hope language is an essential task. There have been many assignments about offensive language. Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion at LT-EDI 2021-EACL 2021 uses another unique perspective -- to identify the language of Hope to make contributions to society. The XLM-Roberta model is an excellent multilingual model. Our team used a fine-tuned XLM-Roberta model to accomplish this task.
Hope is an essential aspect of mental health stability and recovery in every individual in this fast-changing world. Any tools and methods developed for detection, analysis, and generation of hope speech will be beneficial. In this paper, we propose a model on hope-speech detection to automatically detect web content that may play a positive role in diffusing hostility on social media. We perform the experiments by taking advantage of pre-processing and transfer-learning models. We observed that the pre-trained multilingual-BERT model with convolution neural networks gave the best results. Our model ranked first, third, and fourth ranks on English, Malayalam-English, and Tamil-English code-mixed datasets.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا