
Now, It's Personal: The Need for Personalized Word Sense Disambiguation


Publication date: 2021
Research language: English
 Created by Shamra Editor





Authors tend to use one predominant sense for a given lemma, and that sense can differ from author to author. An author-agnostic word sense disambiguation (WSD) model trained on text from multiple authors might not capture this. Our work finds that WordNet's first senses, the predominant senses of our dataset's genre, and the predominant senses of an individual author can all differ; author-agnostic models can therefore perform well over the entire dataset but poorly on individual authors. In this work, we explore methods for personalizing WSD models by tailoring existing state-of-the-art models toward an individual author, exploiting that author's sense distributions. We propose a novel WSD dataset and show that personalizing a WSD system with knowledge of an author's sense distributions or predominant senses can greatly increase its performance.
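One simple way to picture the idea of exploiting an author's sense distribution is to rescale an author-agnostic model's sense probabilities by a smoothed, per-author prior. The sketch below is an illustration only and is not the method from the paper: the function names, the sense labels, the add-alpha smoothing, and the `weight` exponent are all assumptions made for this example.

```python
from collections import Counter


def estimate_author_prior(observed_senses, candidate_senses, alpha=1.0):
    """Add-alpha smoothed sense distribution for one author and one lemma."""
    counts = Counter(observed_senses)
    total = sum(counts[s] for s in candidate_senses) + alpha * len(candidate_senses)
    return {s: (counts[s] + alpha) / total for s in candidate_senses}


def personalize(model_probs, author_prior, weight=1.0):
    """Rescale model probabilities by the author's prior, then renormalize."""
    scores = {s: p * author_prior[s] ** weight for s, p in model_probs.items()}
    norm = sum(scores.values())
    return {s: v / norm for s, v in scores.items()}


# Hypothetical data: an author-agnostic model slightly prefers the river sense,
# but this author has mostly used the financial sense in earlier text.
model_probs = {"bank/river": 0.6, "bank/finance": 0.4}
history = ["bank/finance"] * 8 + ["bank/river"] * 2

prior = estimate_author_prior(history, list(model_probs))
personalized = personalize(model_probs, prior)
best_sense = max(personalized, key=personalized.get)
print(best_sense)
```

With these made-up numbers the author's history outweighs the model's slight preference, so the predicted sense flips to the financial one. Setting `weight` to 0 would recover the unpersonalized prediction.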



References used
https://aclanthology.org/

Read More

Supervised systems have nowadays become the standard recipe for Word Sense Disambiguation (WSD), with Transformer-based language models as their primary ingredient. However, while these systems have certainly attained unprecedented performances, virtually all of them operate under the constraining assumption that, given a context, each word can be disambiguated individually with no account of the other sense choices. To address this limitation and drop this assumption, we propose CONtinuous SEnse Comprehension (ConSeC), a novel approach to WSD: leveraging a recent re-framing of this task as a text extraction problem, we adapt it to our formulation and introduce a feedback loop strategy that allows the disambiguation of a target word to be conditioned not only on its context but also on the explicit senses assigned to nearby words. We evaluate ConSeC and examine how its components lead it to surpass all its competitors and set a new state of the art on English WSD. We also explore how ConSeC fares in the cross-lingual setting, focusing on 8 languages with various degrees of resource availability, and report significant improvements over prior systems. We release our code at https://github.com/SapienzaNLP/consec.
Words are defined based on their meanings in various ways in different resources. Aligning word senses across monolingual lexicographic resources increases domain coverage and enables integration and incorporation of data. In this paper, we explore the application of classification methods using manually-extracted features along with representation learning techniques in the task of word sense alignment and semantic relationship detection. We demonstrate that the performance of classification methods varies dramatically with the type of semantic relationship, due to the nature of the task, but that these methods outperform the previous experiments.
This paper describes our submission to SemEval 2021 Task 2. We compare XLM-RoBERTa Base and Large in the few-shot and zero-shot settings and additionally test the effectiveness of using a k-nearest neighbors classifier in the few-shot setting instead of the more traditional multi-layered perceptron. Our experiments on both the multi-lingual and cross-lingual data show that XLM-RoBERTa Large, unlike the Base version, seems to be able to more effectively transfer learning in a few-shot setting and that the k-nearest neighbors classifier is indeed a more powerful classifier than a multi-layered perceptron when used in few-shot learning.
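To make the k-nearest neighbors idea concrete, the sketch below classifies a query by majority vote among its most similar labeled examples. It is a minimal illustration, not the submission's implementation: the two-dimensional vectors stand in for sentence-pair embeddings (which in practice would come from an encoder such as XLM-RoBERTa), and the labels, the cosine similarity measure, and `k=3` are assumptions chosen for the example.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(query, support, k=3):
    """Return the majority label among the k support examples closest to query."""
    ranked = sorted(support, key=lambda ex: cosine(query, ex[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical few-shot support set: (embedding, label) pairs.
support = [
    ([1.0, 0.1], "same"),
    ([0.9, 0.2], "same"),
    ([0.8, 0.3], "same"),
    ([0.1, 1.0], "different"),
    ([0.2, 0.9], "different"),
]

print(knn_predict([0.95, 0.15], support))
print(knn_predict([0.10, 0.95], support))
```

Because the classifier has no trainable parameters, adding a few more labeled examples only means extending `support`, which is one reason a nearest-neighbor approach can be attractive when very little training data is available.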
In parataxis languages like Chinese, word meanings are constructed using specific word-formations, which can help to disambiguate word senses. However, such knowledge is rarely explored in previous word sense disambiguation (WSD) methods. In this paper, we propose to leverage word-formation knowledge to enhance Chinese WSD. We first construct a large-scale Chinese lexical sample WSD dataset with word-formations. Then, we propose a model FormBERT to explicitly incorporate word-formations into sense disambiguation. To further enhance generalizability, we design a word-formation predictor module in case word-formation annotations are unavailable. Experimental results show that our method brings substantial performance improvement over strong baselines.
In this paper, we describe our proposed methods for the multilingual Word-in-Context disambiguation task in SemEval-2021. In this task, systems should determine whether a word that occurs in two different sentences is used with the same meaning or not. We proposed several methods using a pre-trained BERT model. In two of them, we paraphrased the sentences and added the paraphrases as input to BERT, and in one of them, we used WordNet to add extra lexical information. We evaluated our proposed methods on the test data of SemEval-2021 Task 2.
