Cambridge at SemEval-2021 Task 2: Neural WiC-Model with Data Augmentation and Exploration of Representation

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

This paper describes the system of the Cambridge team submitted to the SemEval-2021 shared task on Multilingual and Cross-lingual Word-in-Context Disambiguation. Building on a pre-trained masked language model, our system is first pre-trained on out-of-domain data, and then fine-tuned on in-domain data. We demonstrate the effectiveness of the proposed two-step training strategy and the benefits of data augmentation from both existing examples and new resources. We further investigate different representations and show that the addition of distance-based features is helpful in the word-in-context disambiguation task. Our system yields highly competitive results in the cross-lingual track without training on any cross-lingual data, and achieves state-of-the-art results in the multilingual track, ranking first in two languages (Arabic and Russian) and second in French out of 171 submitted systems.
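
As a rough illustration of what "distance-based features" could look like, the sketch below builds a feature vector for a pair of target-word embeddings. It is not the Cambridge team's implementation: the function names `cosine_distance` and `pair_features`, the choice of L1 and cosine distance, and the concatenation layout are all assumptions made for this example.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def pair_features(u, v):
    """Build a feature vector for two target-word embeddings.

    Layout (an assumption for illustration, not taken from the paper):
    [u, v, |u - v|, L1 distance, cosine distance]
    """
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    abs_diff = [abs(a - b) for a, b in zip(u, v)]
    l1_distance = sum(abs_diff)
    return list(u) + list(v) + abs_diff + [l1_distance, cosine_distance(u, v)]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings" of the same word in two contexts.
    print(pair_features([1.0, 0.0], [0.0, 1.0]))
```

In such a setup the feature vector would be passed to a binary classifier; the embeddings themselves would come from the fine-tuned language model, which is outside the scope of this sketch.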

References used

https://aclanthology.org/

GX at SemEval-2021 Task 2: BERT with Lemma Information for MCL-WiC Task

Association for Computational Linguistics, 2021 (article)

This paper presents the GX system for the Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC) task. The purpose of the MCL-WiC task is to tackle the challenge of capturing the polysemous nature of words without relying on a fixed sense inventory in a multilingual and cross-lingual setting. To this end, we use context-specific word embeddings from BERT to disambiguate words in different contexts. For languages without an available training corpus, such as Chinese, we use a neural machine translation model to translate the English data released by the organizers and obtain pseudo-data. In this paper, we apply our system to the English and Chinese multilingual setting, and the experimental results show that our method has certain advantages.

SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC)

Association for Computational Linguistics, 2021 (article)

In this paper, we introduce the first SemEval task on Multilingual and Cross-Lingual Word-in-Context disambiguation (MCL-WiC). This task allows the largely under-investigated inherent ability of systems to discriminate between word senses within and across languages to be evaluated, dropping the requirement of a fixed sense inventory. Framed as a binary classification, our task is divided into two parts. In the multilingual sub-task, participating systems are required to determine whether two target words, each occurring in a different context within the same language, express the same meaning or not. Instead, in the cross-lingual part, systems are asked to perform the task in a cross-lingual scenario, in which the two target words and their corresponding contexts are provided in two different languages. We illustrate our task, as well as the construction of our manually-created dataset including five languages, namely Arabic, Chinese, English, French and Russian, and the results of the participating systems. Datasets and results are available at: https://github.com/SapienzaNLP/mcl-wic.
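
The binary framing described above can be sketched as a small data structure plus a scoring function. The field names of `WiCInstance` are hypothetical and do not reflect the released dataset format, and accuracy is assumed here as the evaluation metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WiCInstance:
    """One word-in-context example (field names are illustrative only)."""
    lemma: str
    context1: str
    context2: str
    lang1: str
    lang2: str
    same_meaning: bool  # gold label: True if both occurrences share a sense

    @property
    def is_cross_lingual(self) -> bool:
        # Multilingual sub-task: both contexts in one language.
        # Cross-lingual sub-task: the contexts are in different languages.
        return self.lang1 != self.lang2


def accuracy(gold, predicted):
    """Fraction of instances whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty set of instances")
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)


if __name__ == "__main__":
    example = WiCInstance(
        lemma="bank",
        context1="She sat on the bank of the river.",
        context2="He deposited the cheque at the bank.",
        lang1="en",
        lang2="en",
        same_meaning=False,
    )
    print(example.is_cross_lingual)
    print(accuracy([True, False, True], [True, True, True]))
```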

NLPIITR at SemEval-2021 Task 6: RoBERTa Model with Data Augmentation for Persuasion Techniques Detection

Association for Computational Linguistics, 2021 (article)

This paper describes and examines different systems to address Task 6 of SemEval-2021: Detection of Persuasion Techniques In Texts And Images, Subtask 1. The task aims to build a model for identifying rhetorical and psychological techniques (such as causal oversimplification, name-calling, smear) in the textual content of a meme which is often used in a disinformation campaign to influence the users. The paper provides an extensive comparison among various machine learning systems as a solution to the task. We elaborate on the pre-processing of the text data in favor of the task and present ways to overcome the class imbalance. The results show that fine-tuning a RoBERTa model gave the best results with an F1-Micro score of 0.51 on the development set.
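
For readers unfamiliar with the F1-Micro score mentioned above, here is a minimal sketch of how it can be computed. It assumes a multi-label setting in which each text carries a set of technique labels; the function name `micro_f1` and the toy labels are illustrative and are not taken from the paper.

```python
def micro_f1(gold_sets, predicted_sets):
    """Micro-averaged F1 over per-example label sets.

    True positives, false positives and false negatives are pooled
    across all examples before the F1 score is computed.
    """
    if len(gold_sets) != len(predicted_sets):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = fn = 0
    for gold, pred in zip(gold_sets, predicted_sets):
        gold, pred = set(gold), set(pred)
        tp += len(gold & pred)
        fp += len(pred - gold)
        fn += len(gold - pred)
    denominator = 2 * tp + fp + fn
    return 0.0 if denominator == 0 else 2 * tp / denominator


if __name__ == "__main__":
    gold = [{"smear"}, {"name-calling", "smear"}]
    pred = [{"smear"}, {"smear"}]
    print(micro_f1(gold, pred))
```

Because counts are pooled, frequent labels dominate the score, which is one reason class imbalance matters when this metric is reported.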

UAlberta at SemEval-2021 Task 2: Determining Sense Synonymy via Translations

Association for Computational Linguistics, 2021 (article)

We describe the University of Alberta systems for the SemEval-2021 Word-in-Context (WiC) disambiguation task. We explore the use of translation information for deciding whether two different tokens of the same word correspond to the same sense of the word. Our focus is on developing principled theoretical approaches which are grounded in linguistic phenomena, leading to more explainable models. We show that translations from multiple languages can be leveraged to improve the accuracy on the WiC task.

Cambridge at SemEval-2021 Task 1: An Ensemble of Feature-Based and Neural Models for Lexical Complexity Prediction

Association for Computational Linguistics, 2021 (article)

This paper describes our submission to the SemEval-2021 shared task on Lexical Complexity Prediction. We approached it as a regression problem and present an ensemble combining four systems, one feature-based and three neural with fine-tuning, frequency pre-training and multi-task learning, achieving Pearson scores of 0.8264 and 0.7556 on the trial and test sets respectively (sub-task 1). We further present our analysis of the results and discuss our findings.
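
The abstract does not say how the four systems are combined, so the sketch below simply averages their predictions, which is an assumption, and scores the result with the Pearson correlation coefficient. The names `average_ensemble` and `pearson` are illustrative.

```python
import math


def average_ensemble(system_predictions):
    """Average per-item predictions across systems (assumed combination rule)."""
    if not system_predictions:
        raise ValueError("need at least one system")
    n_items = len(system_predictions[0])
    if any(len(p) != n_items for p in system_predictions):
        raise ValueError("all systems must score the same items")
    n_systems = len(system_predictions)
    return [sum(p[i] for p in system_predictions) / n_systems
            for i in range(n_items)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy complexity scores from two hypothetical systems and a gold standard.
    combined = average_ensemble([[0.2, 0.4, 0.9], [0.4, 0.8, 0.7]])
    print(pearson(combined, [0.3, 0.5, 0.9]))
```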
