Definition Modelling for Appropriate Specificity

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

Definition generation techniques aim to generate a definition of a target word or phrase given a context. In previous studies, researchers have faced various issues such as the out-of-vocabulary problem and over/under-specificity problems. Over-specific definitions present narrow word meanings, whereas under-specific definitions present general and context-insensitive meanings. Herein, we propose a method for definition generation with appropriate specificity. The proposed method addresses the aforementioned problems by leveraging a pre-trained encoder-decoder model, namely Text-to-Text Transfer Transformer, and introducing a re-ranking mechanism to model specificity in definitions. Experimental results on standard evaluation datasets indicate that our method significantly outperforms the previous state-of-the-art method. Moreover, manual evaluation confirms that our method effectively addresses the over/under-specificity problems.
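As a rough illustration of the re-ranking idea mentioned in the abstract, the sketch below re-orders candidate definitions by combining a generator score with a penalty for being too general or too narrow. The `Candidate` fields, the `specificity` values, the `target` level and the weight `alpha` are all hypothetical. The abstract does not say how specificity is scored, so this should not be read as the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    log_prob: float     # generator score (made-up values for illustration)
    specificity: float  # hypothetical score in [0, 1]: 0 = too general, 1 = too narrow


def rerank(candidates, target=0.5, alpha=1.0):
    """Sort candidates by log_prob minus a penalty for missing the target specificity."""
    def score(c):
        return c.log_prob - alpha * abs(c.specificity - target)
    return sorted(candidates, key=score, reverse=True)


candidates = [
    Candidate("a thing", log_prob=-1.0, specificity=0.05),
    Candidate("a small domesticated feline kept as a pet", log_prob=-1.2, specificity=0.5),
    Candidate("a three-year-old tabby living in Paris", log_prob=-1.1, specificity=0.95),
]

best = rerank(candidates)[0]
print(best.text)
```

With `alpha=0` the ranking falls back to the generator score alone, which is one way to see what the penalty term contributes.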

References used

https://aclanthology.org/

Cultural Topic Modelling over Novel Wikipedia Corpora for South-Slavic Languages

Association for Computational Linguistics, 2021, article

There is a shortage of high-quality corpora for South-Slavic languages. Such corpora are useful to computer scientists and researchers in social sciences and humanities alike, focusing on numerous linguistic, content analysis, and natural language processing applications. This paper presents a workflow for mining Wikipedia content and processing it into linguistically-processed corpora, applied on the Bosnian, Bulgarian, Croatian, Macedonian, Serbian, Serbo-Croatian and Slovenian Wikipedia. We make the resulting seven corpora publicly available. We showcase these corpora by comparing the content of the underlying Wikipedias, our assumption being that the content of the Wikipedias reflects broadly the interests in various topics in these Balkan nations. We perform the content comparison by using topic modelling algorithms and various distribution comparisons. The results show that all Wikipedias are topically rather similar, with all of them covering art, culture, and literature, whereas they contain differences in geography, politics, history and science.
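One common way to compare two topic distributions is the Jensen-Shannon divergence. The sketch below is only an example of such a "distribution comparison": the abstract does not name the measures actually used, and the two topic vectors are invented for illustration.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 when using log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical topic proportions for two Wikipedias (e.g. culture, geography, politics).
wiki_a = [0.5, 0.3, 0.2]
wiki_b = [0.4, 0.3, 0.3]

print(js_divergence(wiki_a, wiki_b))
```

A value near 0 means the two topic mixes are nearly identical, and a value of 1 means they share no topics at all.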

Keywords: South-Slavic languages, cultural topic modelling, corpora

Cross-Lingual Wolastoqey-English Definition Modelling

Association for Computational Linguistics, 2021, article

Definition modelling is the task of automatically generating a dictionary-style definition given a target word. In this paper, we consider cross-lingual definition generation. Specifically, we generate English definitions for Wolastoqey (Malecite-Passamaquoddy) words. Wolastoqey is an endangered, low-resource polysynthetic language. We hypothesize that sub-word representations based on byte pair encoding (Sennrich et al., 2016) can be leveraged to represent morphologically-complex Wolastoqey words and overcome the challenge of not having large corpora available for training. Our experimental results demonstrate that this approach outperforms baseline methods in terms of BLEU score.
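The byte pair encoding procedure cited in the abstract can be sketched in a few lines: repeatedly merge the most frequent adjacent symbol pair in a word-frequency vocabulary. The function names and the toy English word counts below are assumptions for illustration, not data or code from the paper.

```python
from collections import Counter


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a word -> frequency mapping."""
    # Start from characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): f for word, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # ties go to the first pair seen
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges, vocab


# Toy counts for illustration only.
word_freqs = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
merges, vocab = learn_bpe(word_freqs, 3)
print(merges)
```

On this toy vocabulary the first three merges build the suffix `est</w>`, which shows how frequent morpheme-like units emerge without any hand-written segmentation rules.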

Keywords: Wolastoqey-English definition modelling, definition modelling, cross-lingual Wolastoqey-English definition

The Appropriate Way to Conclude B.O.T Contracts in Syria and the Guarantees Necessary for Their Conclusion

Al-Baath University, 2014, research paper

Syria has noticeably lagged in passing a law governing BOT contracts, even though many countries have issued legislation on this type of contract. This made it necessary to examine the appropriate ways to implement the BOT contract system, to assess these ways and the possibility of developing them. The research is divided into two sections: the first addresses the appropriate way to conclude a BOT contract, and the second addresses the legal guarantees for financing BOT projects. It ends with results and recommendations that lay the foundation for applying these contracts.

Keywords: Syria, B.O.T contracts, legal guarantees

Different Strokes for Different Folks: Investigating Appropriate Further Pre-training Approaches for Diverse Dialogue Tasks

Association for Computational Linguistics, 2021, article

Loading models pre-trained on a large-scale general-domain corpus and fine-tuning them on specific downstream tasks is gradually becoming a paradigm in Natural Language Processing. Previous investigations show that introducing a further pre-training phase between the pre-training and fine-tuning phases, to adapt the model on domain-specific unlabeled data, can bring positive effects. However, most of these further pre-training works just keep running the conventional pre-training task, e.g., masked language modelling, which can be regarded as domain adaptation to bridge the data distribution gap. After observing diverse downstream tasks, we suggest that different tasks may also need a further pre-training phase with appropriate training tasks to bridge the task formulation gap. To investigate this, we carry out a study on improving multiple task-oriented dialogue downstream tasks by designing various tasks for the further pre-training phase. The experiment shows that different downstream tasks prefer different further pre-training tasks, with which they have an intrinsic correlation, and that most further pre-training tasks significantly improve certain target tasks rather than all of them. Our investigation indicates that it is important and effective to design appropriate further pre-training tasks that model specific information benefiting downstream tasks. In addition, we present multiple constructive empirical conclusions for enhancing task-oriented dialogues.

Keywords: pre-training approaches

Arabic Compact Language Modelling for Resource Limited Devices

Association for Computational Linguistics, 2021, article

Natural language modelling has gained a lot of interest recently. The current state-of-the-art results are achieved by first training a very large language model and then fine-tuning it on multiple tasks. However, there is little work on smaller, more compact language models for resource-limited devices or applications, let alone on how to efficiently train such models for a low-resource language like Arabic. In this paper, we investigate how such models can be trained in a compact way for Arabic. We also show how distillation and quantization can be applied to create even smaller models. Our experiments show that our largest model, which is 2x smaller than the baseline, can achieve better results on multiple tasks with 2x less pretraining data.
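To make the quantization step concrete, here is a minimal sketch of symmetric 8-bit weight quantization, one simple scheme among several. The abstract does not specify which scheme the authors use, and the weight values below are made up.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in q]


# Hypothetical weights; a real model would apply this per layer or per channel.
weights = [0.02, -0.51, 0.33, 1.27, -1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is then stored in one byte instead of four, at the cost of a rounding error of at most half the scale per weight.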

Keywords: resource limited devices, resource limited, limited devices
