We propose a novel method of homonymy-polysemy discrimination for three Indo-European languages (English, Spanish and Polish). Support vector machines and LASSO logistic regression were used successfully in this task, outperforming the baselines. The feature set utilised lemma properties, gloss similarities, graph distances and polysemy patterns. The proposed ML models performed equally well for English and for the other two languages, which constituted the test data sets. The algorithms not only ruled out most cases of homonymy but were also effective in distinguishing between close and indirect semantic relatedness.
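To make the classification setup more concrete, the following is a minimal sketch of LASSO (L1-penalised) logistic regression trained by proximal gradient descent. It is not the authors' implementation: the three toy features (a gloss similarity, a graph distance and a pure-noise column), the synthetic data generator and all function names and hyperparameters are assumptions made for illustration only. The SVM models and the real feature set from the abstract are not reproduced here.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a weight towards zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_toy_pairs(n, seed=0):
    """Hypothetical sense pairs: label 1 = related (polysemy), 0 = unrelated (homonymy).

    Features are invented for illustration: [gloss_similarity, graph_distance, noise].
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        related = rng.random() < 0.5
        gloss_sim = rng.gauss(0.7 if related else 0.3, 0.1)
        graph_dist = rng.gauss(0.3 if related else 0.7, 0.1)
        noise = rng.gauss(0.0, 1.0)  # uninformative feature
        X.append([gloss_sim, graph_dist, noise])
        y.append(1 if related else 0)
    return X, y


def fit_lasso_logistic(X, y, l1=0.02, lr=0.5, epochs=500):
    """Proximal gradient descent (ISTA) on the L1-penalised logistic loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # Gradient step on the smooth loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * l1) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b  # the intercept is left unpenalised
    return w, b


def predict(w, b, x):
    """Return 1 (related senses) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


if __name__ == "__main__":
    X_train, y_train = make_toy_pairs(200, seed=0)
    X_test, y_test = make_toy_pairs(100, seed=1)
    w, b = fit_lasso_logistic(X_train, y_train)
    hits = sum(predict(w, b, x) == t for x, t in zip(X_test, y_test))
    print("weights:", w, "bias:", b)
    print("toy test accuracy:", hits / len(y_test))
```

The L1 penalty is what distinguishes LASSO from ordinary logistic regression: the soft-thresholding step pushes weights of uninformative features towards zero, which is one plausible reason to prefer it when a feature set mixes several heterogeneous groups. In practice, a library implementation would normally replace this hand-written loop.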
References used
https://aclanthology.org/
Deciding whether a semantically ambiguous word is homonymous or polysemous is equivalent to establishing whether it has any pair of senses that are semantically unrelated. We present novel methods for this task that leverage information from multilin
One of the central aspects of contextualised language models is that they should be able to distinguish the meaning of lexically ambiguous words by their contexts. In this paper we investigate the extent to which the contextualised embeddings of word
Punctuation restoration is a fundamental requirement for the readability of text derived from Automatic Speech Recognition (ASR) systems. Most contemporary solutions are limited to predicting only a few of the most frequently occurring marks, such as
Despite recent advances in semantic role labeling propelled by pre-trained text encoders like BERT, performance lags behind when applied to predicates observed infrequently during training or to sentences in new domains. In this work, we investigate
In this paper we compare Oxford Lexico and Merriam Webster dictionaries with Princeton WordNet with respect to the description of semantic (dis)similarity between polysemous and homonymous senses that could be inferred from them. WordNet lacks any ex