Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Homonymy and Polysemy Detection with Multilingual Information

الكشف المثلي الجنسي و Polysemy مع معلومات متعددة اللغات

751 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

polysemy detection hauer and kondrak polysemy كشف polysemy. Hauer و Kondrak. تعدد المعاني صناعة حمض الفوسفور

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Deciding whether a semantically ambiguous word is homonymous or polysemous is equivalent to establishing whether it has any pair of senses that are semantically unrelated. We present novel methods for this task that leverage information from multilingual lexical resources. We formally prove the theoretical properties that provide the foundation for our methods. In particular, we show how the One Homonym Per Translation hypothesis of Hauer and Kondrak (2020a) follows from the synset properties formulated by Hauer and Kondrak (2020b). Experimental evaluation shows that our approach sets a new state of the art for homonymy detection.

References used

https://aclanthology.org/

rate research

Discriminating Homonymy from Polysemy in Wordnets: English, Spanish and Polish Nouns

715 - Association for Computation Linguistics 2021 مقالة

We propose a novel method of homonymy-polysemy discrimination for three Indo-European Languages (English, Spanish and Polish). Support vector machines and LASSO logistic regression were successfully used in this task, outperforming baselines. The fea ture set utilised lemma properties, gloss similarities, graph distances and polysemy patterns. The proposed ML models performed equally well for English and the other two languages (constituting testing data sets). The algorithms not only ruled out most cases of homonymy but also were efficacious in distinguishing between closer and indirect semantic relatedness.

spanish and polish polish nouns polysemy in wordnets الإسبانية والبولندية الأسماء البولندية Polysemy في Waldnets. صناعة حمض الفوسفور المزيد..

Patterns of Polysemy and Homonymy in Contextualised Language Models

718 - Association for Computation Linguistics 2021 مقالة

One of the central aspects of contextualised language models is that they should be able to distinguish the meaning of lexically ambiguous words by their contexts. In this paper we investigate the extent to which the contextualised embeddings of word forms that display multiplicity of sense reflect traditional distinctions of polysemy and homonymy. To this end, we introduce an extended, human-annotated dataset of graded word sense similarity and co-predication acceptability, and evaluate how well the similarity of embeddings predicts similarity in meaning. Both types of human judgements indicate that the similarity of polysemic interpretations falls in a continuum between identity of meaning and homonymy. However, we also observe significant differences within the similarity ratings of polysemes, forming consistent patterns for different types of polysemic sense alternation. Our dataset thus appears to capture a substantial part of the complexity of lexical ambiguity, and can provide a realistic test bed for contextualised embeddings. Among the tested models, BERT Large shows the strongest correlation with the collected word sense similarity ratings, but struggles to consistently replicate the observed similarity patterns. When clustering ambiguous word forms based on their embeddings, the model displays high confidence in discerning homonyms and some types of polysemic alternations, but consistently fails for others.

نمذجة اللغة تحت الإشراف على الذات contextualised language اللغة السياقية صناعة حمض الفوسفور

Multilingual and Cross-Lingual Intent Detection from Spoken Data

1257 - Association for Computation Linguistics 2021 مقالة

We present a systematic study on multilingual and cross-lingual intent detection (ID) from spoken data. The study leverages a new resource put forth in this work, termed MInDS-14, a first training and evaluation resource for the ID task with spoken d ata. It covers 14 intents extracted from a commercial system in the e-banking domain, associated with spoken examples in 14 diverse language varieties. Our key results indicate that combining machine translation models with state-of-the-art multilingual sentence encoders (e.g., LaBSE) yield strong intent detectors in the majority of target languages covered in MInDS-14, and offer comparative analyses across different axes: e.g., translation direction, impact of speech recognition, data augmentation from a related domain. We see this work as an important step towards more inclusive development and evaluation of multilingual ID from spoken data, hopefully in a much wider spectrum of languages compared to prior work.

cross-lingual intent detection spoken data الكشف عن النية عبر اللغات البيانات المنطوقة صناعة حمض الفوسفور

Toward Discourse-Aware Models for Multilingual Fake News Detection

800 - Association for Computation Linguistics 2021 مقالة

Statements that are intentionally misstated (or manipulated) are of considerable interest to researchers, government, security, and financial systems. According to deception literature, there are reliable cues for detecting deception and the belief t hat liars give off cues that may indicate their deception is near-universal. Therefore, given that deceiving actions require advanced cognitive development that honesty simply does not require, as well as people's cognitive mechanisms have promising guidance for deception detection, in this Ph.D. ongoing research, we propose to examine discourse structure patterns in multilingual deceptive news corpora using the Rhetorical Structure Theory framework. Considering that our work is the first to exploit multilingual discourse-aware strategies for fake news detection, the research community currently lacks multilingual deceptive annotated corpora. Accordingly, this paper describes the current progress in this thesis, including (i) the construction of the first multilingual deceptive corpus, which was annotated by specialists according to the Rhetorical Structure Theory framework, and (ii) the introduction of two new proposed rhetorical relations: INTERJECTION and IMPERATIVE, which we assume to be relevant for the fake news detection task.

المرشحين للاطلاع على الاختبارات structure theory framework discourse-aware models نظرية الهيكل الخطابي إطار نظرية الهيكل نماذج علم الخطاب صناعة حمض الفوسفور المزيد..

MUDES: Multilingual Detection of Offensive Spans

621 - Association for Computation Linguistics 2021 مقالة

The interest in offensive content identification in social media has grown substantially in recent years. Previous work has dealt mostly with post level annotations. However, identifying offensive spans is useful in many ways. To help coping with thi s important challenge, we present MUDES, a multilingual system to detect offensive spans in texts. MUDES features pre-trained models, a Python API for developers, and a user-friendly web-based interface. A detailed description of MUDES' components is presented in this paper.

multilingual detection offensive spans الكشف متعدد اللغات المدعم الهجومي صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Homonymy and Polysemy Detection with Multilingual Information

الكشف المثلي الجنسي و Polysemy مع معلومات متعددة اللغات

Ask ChatGPT about the research

Read More

suggested questions