In this paper we analyze the extent to which contextualized sense embeddings, i.e., sense embeddings that are computed based on contextualized word embeddings, are transferable across languages. To this end, we compiled a unified cross-lingual benchmark for Word Sense Disambiguation. We then propose two simple strategies to transfer sense-specific knowledge across languages and test them on the benchmark. Experimental results show that this contextualized knowledge can be effectively transferred to similar languages through pre-trained multilingual language models, to the extent that they can outperform monolingual representations learned from existing language-specific data.
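The transfer strategies above rest on sense embeddings built from contextualized vectors. A minimal sketch of that representation, assuming a standard 1-nearest-neighbour setup in which each sense embedding is the centroid of its annotated occurrences; `encode` is a hypothetical stand-in for a multilingual encoder, as the abstract does not specify the architecture:

```python
from math import sqrt

def build_sense_embeddings(tagged_contexts, encode):
    # Average the contextual vectors of all annotated occurrences of each sense.
    # tagged_contexts: iterable of (tokens, target_index, sense_id).
    sums, counts = {}, {}
    for tokens, idx, sense in tagged_contexts:
        vec = encode(tokens, idx)
        if sense not in sums:
            sums[sense] = [0.0] * len(vec)
            counts[sense] = 0
        sums[sense] = [a + b for a, b in zip(sums[sense], vec)]
        counts[sense] += 1
    return {s: [x / counts[s] for x in sums[s]] for s in sums}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb + 1e-9)

def disambiguate(tokens, idx, candidates, sense_embs, encode):
    # Nearest-neighbour WSD: pick the candidate sense whose embedding
    # is closest (by cosine) to the target's contextual vector.
    vec = encode(tokens, idx)
    return max(candidates, key=lambda s: cosine(vec, sense_embs[s]))
```

Cross-lingual transfer then amounts to reusing the sense centroids built from one language's annotations when disambiguating another language encoded by the same multilingual model.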
Supervised systems have nowadays become the standard recipe for Word Sense Disambiguation (WSD), with Transformer-based language models as their primary ingredient. However, while these systems have certainly attained unprecedented performances, virtually all of them operate under the constraining assumption that, given a context, each word can be disambiguated individually with no account of the other sense choices. To address this limitation and drop this assumption, we propose CONtinuous SEnse Comprehension (ConSeC), a novel approach to WSD: leveraging a recent re-framing of this task as a text extraction problem, we adapt it to our formulation and introduce a feedback loop strategy that allows the disambiguation of a target word to be conditioned not only on its context but also on the explicit senses assigned to nearby words. We evaluate ConSeC and examine how its components lead it to surpass all its competitors and set a new state of the art on English WSD. We also explore how ConSeC fares in the cross-lingual setting, focusing on 8 languages with various degrees of resource availability, and report significant improvements over prior systems. We release our code at https://github.com/SapienzaNLP/consec.
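The feedback loop described above can be sketched as iterative reconditioning: first disambiguate each word in isolation, then re-score each decision given the senses currently assigned to the other words. Here `score` is a hypothetical stand-in for the model's extractive scorer, and the fixed number of refinement passes is an illustrative assumption, not the paper's exact procedure:

```python
def feedback_loop_decode(words, candidates, score, passes=2):
    # candidates[i]: list of candidate senses for words[i].
    # score(i, sense, assigned) -> float, where `assigned` maps the
    # positions of OTHER words to their currently assigned senses.
    assigned = {}
    # First pass: disambiguate each word with no sense context.
    for i, _ in enumerate(words):
        assigned[i] = max(candidates[i], key=lambda s: score(i, s, {}))
    # Feedback passes: recondition each decision on neighbours' senses.
    for _ in range(passes):
        for i, _ in enumerate(words):
            context = {j: s for j, s in assigned.items() if j != i}
            assigned[i] = max(candidates[i], key=lambda s: score(i, s, context))
    return assigned
```

The key property this captures is that a decision made early can be revised once the surrounding senses become explicit, which is what distinguishes the approach from word-by-word independent disambiguation.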
Back-translation (BT) has become one of the de facto components in unsupervised neural machine translation (UNMT), and it explicitly gives UNMT its translation ability. However, all the pseudo bi-texts generated by BT are treated equally as clean data during optimization, without considering their diversity in quality, leading to slow convergence and limited translation performance. To address this problem, we propose a curriculum learning method that gradually utilizes pseudo bi-texts based on their quality at multiple granularities. Specifically, we first apply cross-lingual word embeddings to calculate the potential translation difficulty (quality) of the monolingual sentences. Then, the sentences are fed into UNMT from easy to hard, batch by batch. Furthermore, considering that the quality of sentences/tokens within a particular batch is also diverse, we further adopt the model itself to calculate fine-grained quality scores, which serve as learning factors to balance the contributions of different parts when computing the loss, encouraging the UNMT model to focus on pseudo data of higher quality. Experimental results on WMT14 En-Fr, WMT14 En-De, WMT16 En-Ro, and LDC En-Zh translation tasks demonstrate that the proposed method achieves consistent improvements with faster convergence.
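The two granularities described above, batch-level easy-to-hard ordering and fine-grained loss weighting, can be sketched as follows. Here `difficulty` and `quality_scores` are hypothetical stand-ins for the cross-lingual-embedding difficulty estimate and the model-based quality estimate, respectively:

```python
def curriculum_batches(pseudo_pairs, difficulty, batch_size):
    # Order pseudo bi-texts from easy to hard, then yield batches,
    # so early updates see the most reliably translatable sentences.
    ordered = sorted(pseudo_pairs, key=lambda pair: difficulty(pair[0]))
    for i in range(0, len(ordered), batch_size):
        yield ordered[i:i + batch_size]

def weighted_loss(token_losses, quality_scores):
    # Fine-grained weighting: scale each token's loss by its estimated
    # quality, so cleaner pseudo data dominates the gradient update.
    total = sum(q * l for q, l in zip(quality_scores, token_losses))
    return total / (sum(quality_scores) + 1e-9)
```

In an actual UNMT training loop the difficulty scores would be computed once before training, while the quality scores would be refreshed by the model itself as it improves.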
In parataxis languages like Chinese, word meanings are constructed using specific word-formations, which can help to disambiguate word senses. However, such knowledge is rarely explored in previous word sense disambiguation (WSD) methods. In this paper, we propose to leverage word-formation knowledge to enhance Chinese WSD. We first construct a large-scale Chinese lexical sample WSD dataset with word-formations. Then, we propose a model FormBERT to explicitly incorporate word-formations into sense disambiguation. To further enhance generalizability, we design a word-formation predictor module in case word-formation annotations are unavailable. Experimental results show that our method brings substantial performance improvement over strong baselines.
This paper introduces a novel approach to learn visually grounded meaning representations of words as low-dimensional node embeddings on an underlying graph hierarchy. The lower level of the hierarchy models modality-specific word representations, conditioned on another modality, through dedicated but communicating graphs, while the higher level puts these representations together on a single graph to learn a representation jointly from both modalities. The topology of each graph models similarity relations among words, and is estimated jointly with the graph embedding. The assumption underlying this model is that words sharing similar meaning correspond to communities in an underlying graph in a low-dimensional space. We name this model Hierarchical Multi-Modal Similarity Graph Embedding (HM-SGE). Experimental results validate the ability of HM-SGE to simulate human similarity judgments and concept categorization, outperforming the state of the art.
Knowledge graphs are essential for numerous downstream natural language processing applications, but are typically incomplete, with many facts missing. This has motivated research on the multi-hop reasoning task, which can be formulated as a search process; current models typically perform short-distance reasoning. However, long-distance reasoning is also vital, as it can connect superficially unrelated entities. To the best of our knowledge, no general framework yet approaches multi-hop reasoning in mixed long-short distance reasoning scenarios. We argue that there are two key issues for a general multi-hop reasoning model: i) where to go, and ii) when to stop. Therefore, we propose a general model that resolves these issues with three modules: 1) a local-global knowledge module to estimate the possible paths, 2) a differentiated action dropout module to explore a diverse set of paths, and 3) an adaptive stopping search module to avoid over-searching. Comprehensive results on three datasets demonstrate the superiority of our model, with significant improvements over baselines in both short- and long-distance reasoning scenarios.
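The "where to go / when to stop" framing above can be sketched as a stochastic walk over the graph. Everything here is illustrative rather than the paper's implementation: `score` stands in for the path-scoring module, random edge masking stands in for action dropout, and a threshold stands in for the adaptive stopping criterion:

```python
import random

def multi_hop_search(start, neighbors, score,
                     max_hops=5, dropout=0.3, stop_threshold=0.5):
    # neighbors(entity) -> list of (relation, entity) outgoing edges.
    # score(path, edge) -> float: how promising an edge is ("where to go").
    path = [start]
    for _ in range(max_hops):
        # Action dropout: randomly mask edges to diversify explored paths.
        actions = [e for e in neighbors(path[-1]) if random.random() >= dropout]
        if not actions:
            break
        best = max(actions, key=lambda e: score(path, e))
        # Adaptive stopping: halt when no edge looks promising enough.
        if score(path, best) < stop_threshold:
            break
        path.append(best[1])
    return path
```

Because the walker can stop early or run the full hop budget, the same procedure covers both short- and long-distance reasoning, which is the mixed scenario the abstract targets.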
The field of NLP has made substantial progress in building meaning representations. However, an important aspect of linguistic meaning, social meaning, has been largely overlooked. We introduce the concept of social meaning to NLP and discuss how insights from sociolinguistics can inform work on representation learning in NLP. We also identify key challenges for this new line of research.
This research focuses on the concept of the sign as treated by the French philosopher Jacques Derrida and how he presented a different critical reading of this concept, engaging with the question and seeking new horizons for a critical reading that goes beyond structural closure. This is pursued through a reading strategy that enables us to open up to the text and its multiplicities. The research first introduces the concept as developed by Derrida, then proceeds to discuss the role of the reader as a decisive strategy in creating or producing the sign, itself a product of the text, and the reader's ability to recall memory. All of this serves to clarify the distance among texts that the textual horizon creates, since its role does not concern only the force of the sign as difference but also contains a structural motive. The sign is created once the trace and what is deferred are traced down, thus producing an interplay among texts. Consequently, reading becomes an aesthetic experience that reconstructs the text anew, until the research arrives at cumulative results in a conclusion presenting some of the ideas reached in tackling this issue.
Precision and fulfillment of meaning have been the primary aim of any researcher in language, and since meaning is the outcome of grammatical structure in a specific context, such a researcher must not prefer one to the other. In other words, all grammatical aspects fall into one controversial relation when it comes to interpreting meaning. Therefore, this research tackles the concept of 'situation context' and its related terminology, along with its impact on the field of language, the impact of the religious factor in drawing early scholars' attention to the significance of grammatical form and context, the outcome of both, and the extent to which those scholars relied upon them when interpreting the Qur'anic text and lines of poetry in order to derive new rules from them.
This research addresses Ibn Wahab Al-Kateb's types of rhetoric in relation to Al-Jahiz: Ibn Wahab Al-Kateb accused Abu Othman Al-Jahiz of not giving rhetoric its due and of not studying it thoroughly, and claimed to complete what was missing through his detailed study of the types of rhetoric, which are, to a large extent, similar to the types of rhetoric according to Abu Othman. The latter referred to the importance of the relation between pronunciation and meaning and the necessity of accordance between the two. In addition, meaning is prior to pronunciation because it depends upon thought and contemplation. He describes the types of rhetoric in a pyramidal sequence, through levels that stem, and result, from each other. Similarly, Ibn Wahab considers that the types of rhetoric result from each other; according to critics, these types are a process of the birth of these forms. Despite the great similarity between the types of rhetoric according to Ibn Wahab Al-Kateb and Abu Othman Al-Jahiz, the former was not merely a transcriber, but also added and clarified in certain areas. He alone discussed writers and classified them into five kinds: transcript writer, pronunciation writer, contract writer, judgment writer, and management writer. He also mentioned the most important qualities a transcript writer must have, divided writers into three levels, and elaborated on the types of handwriting and the forms of pens. These issues were omitted by Al-Jahiz in his treatment of the types of rhetoric.