Do you want to publish a course? Click here

CFILT IIT Bombay@LT-EDI-EACL2021: Hope Speech Detection for Equality, Diversity, and Inclusion using Multilingual Representation fromTransformers

CFILT IIT Bombay @ LT-EDI-EACL 2021: الكشف عن الكلام للأمل للمساواة والتنوع والإدماج باستخدام تمثيل متعدد اللغات من المحولات

302   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

With the internet becoming part and parcel of our lives, engagement in social media has increased a lot. Identifying and eliminating offensive content from social media has become of utmost priority to prevent any kind of violence. However, detecting encouraging, supportive and positive content is equally important to prevent misuse of censorship targeted to attack freedom of speech. This paper presents our system for the shared task Hope Speech Detection for Equality, Diversity, and Inclusion at LT-EDI, EACL 2021. The data for this shared task is provided in English, Tamil, and Malayalam which was collected from YouTube comments. It is a multiclass classification problem where each data instance is categorized into one of the three classes: Hope speech', Not hope speech', and Not in intended language'. We propose a system that employs multilingual transformer models to obtain the representation of text and classifies it into one of the three classes. We explored the use of multilingual models trained specifically for Indian languages along with generic multilingual models. Our system was ranked 2nd for English, 2nd for Malayalam, and 7th for the Tamil language in the final leader board published by organizers and obtained a weighted F1-score of 0.92, 0.84, 0.55 respectively on the hidden test dataset used for the competition. We have made our system publicly available at GitHub.



References used
https://aclanthology.org/
rate research

Read More

This paper describes the IIITK's team submissions to the hope speech detection for equality, diversity and inclusion in Dravidian languages shared task organized by LT-EDI 2021 workshop@EACL 2021. Our best configurations for the shared tasks achieve weighted F1 scores of 0.60 for Tamil, 0.83 for Malayalam, and 0.93 for English. We have secured ranks of 4, 3, 2 in Tamil, Malayalam and English respectively.
Analysis and deciphering code-mixed data is imperative in academia and industry, in a multilingual country like India, in order to solve problems apropos Natural Language Processing. This paper proposes a bidirectional long short-term memory (BiLSTM) with the attention-based approach, in solving the hope speech detection problem. Using this approach an F1-score of 0.73 (9thrank) in the Malayalam-English data set was achieved from a total of 31 teams who participated in the competition.
Hope is considered significant for the well-being, recuperation and restoration of human life by health professionals. Hope speech reflects the belief that one can discover pathways to their desired objectives and become roused to utilise those pathw ays. To encourage research in natural language processing towards positive reinforcement approach, we created a hope speech detection dataset. This paper reports on the shared task of hope speech detection for Tamil, English, and Malayalam languages. The shared task was conducted as a part of the EACL 2021 workshop on Language Technology for Equality, Diversity, and Inclusion (LT-EDI-2021). We summarize here the datasets for this challenge which are openly available at https://competitions.codalab.org/competitions/27653, and present an overview of the methods and the results of the competing systems. To the best of our knowledge, this is the first shared task to conduct hope speech detection.
In this paper we work with a hope speech detection corpora that includes English, Tamil, and Malayalam datasets. We present a two phase mechanism to detect hope speech. In the first phase we build a classifier to identify the language of the text. In the second phase, we build a classifier to detect hope speech, non hope speech, or not lang labels. Experimental results show that hope speech detection is challenging and there is scope for improvement.
In this paper, we describe our approach towards utilizing pre-trained models for the task of hope speech detection. We participated in Task 2: Hope Speech Detection for Equality, Diversity and Inclusion at LT-EDI-2021 @ EACL2021. The goal of this tas k is to predict the presence of hope speech, along with the presence of samples that do not belong to the same language in the dataset. We describe our approach to fine-tuning RoBERTa for Hope Speech detection in English and our approach to fine-tuning XLM-RoBERTa for Hope Speech detection in Tamil and Malayalam, two low resource Indic languages. We demonstrate the performance of our approach on classifying text into hope-speech, non-hope and not-language. Our approach ranked 1st in English (F1 = 0.93), 1st in Tamil (F1 = 0.61) and 3rd in Malayalam (F1 = 0.83).

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا