يتم تعريف الكلمات بناء على معانيها بطرق مختلفة في موارد مختلفة.يزيد محاذاة حواس الكلمات عبر الموارد المعجمية أحادية العمل، مما يزيد من تغطية المجال وتمكن تكامل البيانات وإدماجها.في هذه الورقة، نستكشف تطبيق أساليب التصنيف باستخدام الميزات المستخرجة يدويا جنبا إلى جنب مع تقنيات تعليم التمثيل في مهمة محاذاة معنى النصوص والكشف عن العلاقة الدلالية.نوضح أن أداء أساليب التصنيف يختلف بشكل كبير بناء على نوع العلاقات الدلالية بسبب طبيعة المهمة ولكنه يتفوق على التجارب السابقة.
Words are defined based on their meanings in various ways in different resources. Aligning word senses across monolingual lexicographic resources increases domain coverage and enables integration and incorporation of data. In this paper, we explore the application of classification methods using manually-extracted features along with representation learning techniques in the task of word sense alignment and semantic relationship detection. We demonstrate that the performance of classification methods dramatically varies based on the type of semantic relationships due to the nature of the task but outperforms the previous experiments.
References used
https://aclanthology.org/
Supervised systems have nowadays become the standard recipe for Word Sense Disambiguation (WSD), with Transformer-based language models as their primary ingredient. However, while these systems have certainly attained unprecedented performances, virt
Authors of text tend to predominantly use a single sense for a lemma that can differ among different authors. This might not be captured with an author-agnostic word sense disambiguation (WSD) model that was trained on multiple authors. Our work find
Being able to generate accurate word alignments is useful for a variety of tasks. While statistical word aligners can work well, especially when parallel training data are plentiful, multilingual embedding models have recently been shown to give good
Supervised approaches usually achieve the best performance in the Word Sense Disambiguation problem. However, the unavailability of large sense annotated corpora for many low-resource languages make these approaches inapplicable for them in practice.
Text simplification is a growing field with many potential useful applications. Training text simplification algorithms generally requires a lot of annotated data, however there are not many corpora suitable for this task. We propose a new unsupervis