We describe the Uppsala NLP submission to SemEval-2021 Task 2 on multilingual and cross-lingual word-in-context disambiguation. We explore the usefulness of three pre-trained multilingual language models, XLM-RoBERTa (XLMR), Multilingual BERT (mBERT) and multilingual distilled BERT (mDistilBERT). We compare these three models in two setups, fine-tuning and as feature extractors. In the second case we also experiment with using dependency-based information. We find that fine-tuning is better than feature extraction. XLMR performs better than mBERT in the cross-lingual setting both with fine-tuning and feature extraction, whereas these two models give a similar performance in the multilingual setting. mDistilBERT performs poorly with fine-tuning but gives similar results to the other models when used as a feature extractor. We submitted our two best systems, fine-tuned with XLMR and mBERT.
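To make the feature-extraction setup concrete, the following is a minimal sketch rather than the system described above. It assumes the contextual embedding of the target word in each sentence has already been produced by a frozen encoder (toy vectors stand in for it here). The helper names `pair_features` and `same_sense`, the feature combination, and the similarity threshold are all hypothetical choices made for illustration. In the fine-tuning setup, by contrast, the encoder weights themselves would be updated together with the classifier.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_features(e1, e2):
    """Combine two target-word embeddings into one feature vector.

    Assumption: concatenation, absolute difference and element-wise
    product. This is a common choice for pair classification, not
    necessarily the one used in the submission.
    """
    diff = [abs(a - b) for a, b in zip(e1, e2)]
    prod = [a * b for a, b in zip(e1, e2)]
    return list(e1) + list(e2) + diff + prod


def same_sense(e1, e2, threshold=0.5):
    """Toy stand-in for a trained classifier.

    Predicts 'same meaning' when the two embeddings are similar enough.
    The threshold is a hypothetical value; a real system would learn a
    classifier on top of features such as pair_features().
    """
    return cosine(e1, e2) >= threshold


# Toy embeddings standing in for frozen-encoder outputs.
ctx_a = [1.0, 0.0]
ctx_b = [0.9, 0.1]
ctx_c = [0.0, 1.0]
features = pair_features(ctx_a, ctx_b)
```

Here `same_sense(ctx_a, ctx_b)` would label the pair as the same sense and `same_sense(ctx_a, ctx_c)` as different, because only the first pair of toy vectors points in nearly the same direction.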
References used
https://aclanthology.org/
In this paper, we introduce the first SemEval task on Multilingual and Cross-Lingual Word-in-Context disambiguation (MCL-WiC). This task allows the largely under-investigated inherent ability of systems to discriminate between word senses within and
This paper presents a word-in-context disambiguation system. The task focuses on capturing the polysemous nature of words in a multilingual and cross-lingual setting, without considering a strict inventory of word meanings. The system applies Natural
In this work, we present our approach for solving the SemEval 2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC). The task is a sentence pair classification problem where the goal is to detect whether a given word co
We experiment with XLM RoBERTa for Word in Context Disambiguation in the Multi Lingual and Cross Lingual setting so as to develop a single model having knowledge about both settings. We solve the problem as a binary classification problem and also ex
Identifying whether a word carries the same meaning or different meaning in two contexts is an important research area in natural language processing which plays a significant role in many applications such as question answering, document summarisati