Supervised systems have become the standard recipe for Word Sense Disambiguation (WSD), with Transformer-based language models as their primary ingredient. However, while these systems have certainly attained unprecedented performance, virtually all of them operate under the constraining assumption that, given a context, each word can be disambiguated individually, with no account taken of the other sense choices. To address this limitation and drop this assumption, we propose CONtinuous SEnse Comprehension (ConSeC), a novel approach to WSD: leveraging a recent re-framing of this task as a text extraction problem, we adapt it to our formulation and introduce a feedback loop strategy that allows the disambiguation of a target word to be conditioned not only on its context but also on the explicit senses assigned to nearby words. We evaluate ConSeC and examine how its components lead it to surpass all its competitors and set a new state of the art on English WSD. We also explore how ConSeC fares in the cross-lingual setting, focusing on 8 languages with various degrees of resource availability, and report significant improvements over prior systems. We release our code at https://github.com/SapienzaNLP/consec.
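The feedback loop idea described above can be illustrated with a minimal sketch. This is not the ConSeC model: the abstract describes a Transformer-based text extraction formulation, whereas the toy below swaps in a simple token-overlap scorer so that it runs without any dependencies. The function names (`overlap_score`, `disambiguate`), the tiny sense inventory, and the left-to-right processing order are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Sense = Tuple[str, str]  # (hypothetical sense id, definition text)


def overlap_score(context: str, definition: str) -> int:
    """Toy stand-in for a neural scorer: number of distinct shared tokens."""
    return len(set(context.lower().split()) & set(definition.lower().split()))


def disambiguate(
    tokens: List[str],
    inventory: Dict[str, List[Sense]],
    use_feedback: bool = True,
) -> Dict[int, str]:
    """Assign a sense id to every token that has candidates in the inventory.

    With use_feedback=True, the definitions of senses already assigned to
    earlier words are appended to the context before scoring the next word.
    With use_feedback=False, every word is scored against the sentence alone.
    """
    assigned: Dict[int, Sense] = {}
    sentence = " ".join(tokens)
    for idx, token in enumerate(tokens):
        candidates = inventory.get(token)
        if not candidates:
            continue
        context = sentence
        if use_feedback:
            # Feedback loop: condition on senses chosen so far.
            feedback = " ".join(defn for _, defn in assigned.values())
            context = f"{sentence} {feedback}"
        # Ties are resolved in favour of the first listed candidate.
        assigned[idx] = max(candidates, key=lambda s: overlap_score(context, s[1]))
    return {idx: sense[0] for idx, sense in assigned.items()}


# Hypothetical two-word sense inventory, written for this example only.
inventory = {
    "bank": [
        ("bank.river", "sloping land beside a river"),
        ("bank.finance", "a financial institution that accepts deposits of money"),
    ],
    "interest": [
        ("interest.curiosity", "the feeling of wanting to learn more"),
        ("interest.finance", "a charge paid to a financial institution by a borrower"),
    ],
}
tokens = "she deposits money at the bank to earn interest".split()

with_feedback = disambiguate(tokens, inventory, use_feedback=True)
without_feedback = disambiguate(tokens, inventory, use_feedback=False)
print(with_feedback)     # {5: 'bank.finance', 8: 'interest.finance'}
print(without_feedback)  # {5: 'bank.finance', 8: 'interest.curiosity'}
```

In this toy setup, the definition chosen for "bank" supplies the extra evidence that tips "interest" toward its financial sense. Scoring each word independently leaves "interest" with the wrong sense, which is the kind of limitation the abstract says the feedback loop is meant to remove.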
References used
https://aclanthology.org/
Authors of text tend to predominantly use a single sense for a lemma that can differ among different authors. This might not be captured with an author-agnostic word sense disambiguation (WSD) model that was trained on multiple authors. Our work find
Words are defined based on their meanings in various ways in different resources. Aligning word senses across monolingual lexicographic resources increases domain coverage and enables integration and incorporation of data. In this paper, we explore t
This paper describes our submission to SemEval 2021 Task 2. We compare XLM-RoBERTa Base and Large in the few-shot and zero-shot settings and additionally test the effectiveness of using a k-nearest neighbors classifier in the few-shot setting instead
In parataxis languages like Chinese, word meanings are constructed using specific word-formations, which can help to disambiguate word senses. However, such knowledge is rarely explored in previous word sense disambiguation (WSD) methods. In this pap
In this paper, we describe our proposed methods for the multilingual word-in-Context disambiguation task in SemEval-2021. In this task, systems should determine whether a word that occurs in two different sentences is used with the same meaning or no