In this paper, we present a system for the cross-lingual and multilingual word-in-context disambiguation task. The task organizers provided monolingual data in several languages, but no cross-lingual training data were available. To address the lack of officially provided cross-lingual training data, we decided to generate such data ourselves. We describe a simple yet effective approach based on machine translation and back-translation of the lexical units into the original language used in the context of this shared task. In our experiments, we used a neural system based on XLM-R, a pre-trained transformer-based masked language model, as a baseline. We show the effectiveness of the proposed approach, as it substantially improves the performance of this strong neural baseline. In addition, we present multiple types of XLM-R-based classifiers, experimenting with various ways of mixing information from the first and second occurrences of the target word in the two samples.
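To illustrate what a classifier that mixes information from the two occurrences of the target word might look like, here is a minimal sketch assuming the Hugging Face transformers library and the xlm-roberta-base checkpoint. The span masks, mean pooling, and concatenation head are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch (not the authors' exact code) of an XLM-R based
# word-in-context pair classifier: the two sentences are encoded jointly,
# the hidden states of the target word's occurrence in each sentence are
# mean-pooled, concatenated, and passed to a binary head.
import torch
import torch.nn as nn
from transformers import AutoModel

MODEL_NAME = "xlm-roberta-base"  # assumption: base-size XLM-R

class WicPairClassifier(nn.Module):
    def __init__(self, model_name: str = MODEL_NAME):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # One way of "mixing" the two occurrences: concatenate their pooled vectors.
        self.head = nn.Linear(2 * hidden, 2)

    def forward(self, input_ids, attention_mask, span1_mask, span2_mask):
        # span1_mask / span2_mask: 1 over the sub-word positions of the target
        # word in the first and second sentence, 0 elsewhere.
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state

        def pool(mask):
            mask = mask.unsqueeze(-1).float()
            return (states * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)

        v1, v2 = pool(span1_mask), pool(span2_mask)
        # Logits over {different sense, same sense}.
        return self.head(torch.cat([v1, v2], dim=-1))
```

In this sketch, the cross-lingual training pairs produced by machine translation and back-translation would simply be fed through the same encoder as the monolingual ones, since XLM-R shares its vocabulary and parameters across languages.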