How low is too low? A monolingual take on lemmatisation in Indian languages

Publication date: 2021
Research language: English
 Created by Shamra Editor





Lemmatization aims to reduce the sparse data problem by relating the inflected forms of a word to its dictionary form. Most prior work on ML-based lemmatization has focused on high-resource languages, where data sets (word forms) are readily available. For languages that have no linguistic work available, especially on morphology, or where the computational realization of linguistic rules is complex and cumbersome, machine-learning-based lemmatizers are the way to go. In this paper, we devote our attention to lemmatisation for low-resource, morphologically rich scheduled Indian languages using neural methods. Here, low resource means that only a small number of word forms are available. We perform tests to analyse how the performance of monolingual models varies with the corpus size and with the contextual morphological tag data used for training. We show that monolingual approaches with data augmentation can give competitive accuracy even in the low-resource setting, which augurs well for NLP in low-resource settings.
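
To make the experimental idea concrete, the following is a minimal sketch of measuring lemmatization accuracy as the amount of training data grows. It is not the paper's method: the paper uses neural models on Indian languages, whereas this sketch assumes a simple lookup-table baseline and a handful of toy English word forms. All names here (`train_lookup_lemmatizer`, `learning_curve`, `TRAIN`, `TEST`) are hypothetical and chosen only for illustration.

```python
import random


def train_lookup_lemmatizer(pairs):
    """Memorise a lemma for every (word form, lemma) pair seen in training."""
    return {form: lemma for form, lemma in pairs}


def lemmatize(table, form):
    """Return the memorised lemma, or the form itself if it was never seen."""
    return table.get(form, form)


def accuracy(table, test_pairs):
    """Fraction of test word forms mapped to the correct lemma."""
    correct = sum(lemmatize(table, form) == lemma for form, lemma in test_pairs)
    return correct / len(test_pairs)


def learning_curve(train_pairs, test_pairs, sizes, seed=0):
    """Accuracy on the test set for nested training subsets of each size."""
    rng = random.Random(seed)
    shuffled = list(train_pairs)
    rng.shuffle(shuffled)
    return {
        n: accuracy(train_lookup_lemmatizer(shuffled[:n]), test_pairs)
        for n in sizes
    }


# Toy English data, purely illustrative (assumption: not taken from the paper).
TRAIN = [
    ("walked", "walk"), ("walking", "walk"), ("cats", "cat"),
    ("ran", "run"), ("running", "run"), ("mice", "mouse"),
]
TEST = [("walked", "walk"), ("cats", "cat"), ("mice", "mouse"), ("dog", "dog")]

curve = learning_curve(TRAIN, TEST, sizes=[0, 3, 6])
for size, acc in curve.items():
    print(f"{size} training forms -> accuracy {acc:.2f}")
```

With no training data the baseline only gets forms that are already their own lemma right; with more word forms its accuracy can only stay the same or rise. A neural lemmatizer differs in that it can generalise to unseen forms, which is why the low-resource setting is the interesting case.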



References used
https://aclanthology.org/

Read More

This research studies a simplified approach to the design of low-noise bipolar transimpedance preamplifiers for optical receivers. Analytical solutions for optimum biasing and minimum equivalent input-noise current were derived. The study was carried out by comparing the designed circuits. The equivalent input-noise current was calculated by entering the parameters into a Matlab program, and Multisim was used as a simulation tool to detect a pulse signal of 30 ns width.
This paper studies the influence of gangue and investigates theoretical and empirical methods for limiting its formation during steel manufacture. Gangues are divided into acid, sulfate, and compound gangues. Such gangues are known for their bad influence on the mechanical characteristics of steel. Experiments have shown that the influence of sulfur gangue, especially FeS, is worse than that of acid gangue, as the melting point is 880t. Such gangue forms capillary ducts within the steel, producing internal micro-cracks and stress points that lead to collapse and fracture when the metal is exposed to stress below the flowing point, which is considered within the allowed stress values.
In this work, we investigate methods for the challenging task of translating between low-resource language pairs that exhibit some level of similarity. In particular, we consider the utility of transfer learning for translating between several Indo-European low-resource languages from the Germanic and Romance language families. We build two main classes of transfer-based systems to study how relatedness can benefit translation performance. The primary system fine-tunes a model pre-trained on a related language pair, and the contrastive system fine-tunes one pre-trained on an unrelated language pair. Our experiments show that although relatedness is not necessary for transfer learning to work, it does benefit model performance.
This paper describes TenTrans' submission to the WMT21 Multilingual Low-Resource Translation shared task for the Romance language pairs. This task focuses on improving translation quality from Catalan to Occitan, Romanian and Italian, with the assistance of related high-resource languages. We mainly utilize back-translation, pivot-based methods, multilingual models, pre-trained model fine-tuning, and in-domain knowledge transfer to improve the translation quality. On the test set, our best submitted system achieves an average case-sensitive BLEU score of 43.45 across all low-resource pairs. Our data, code, and pre-trained models used in this work are available in TenTrans evaluation examples.
This research studies the diffusion mechanism of chromium atoms and the mechanical and chemical properties of the diffusion chrome coating layer on low-carbon steel, which is considered a surface treatment technique. Many practical experiments were carried out in a powder saturation medium to form a diffusion coating layer containing atomic chromium that diffuses into the coated surface, and some mechanical and chemical properties were studied after diffusion chrome coating. The test results showed that tensile strength, microhardness, and chemical corrosion resistance improved after diffusion chrome coating. The depth of the coating layer also increased with retention time in the furnace and with temperature, a relationship described by a second-degree curve. Conversely, ductility decreased. The results confirm that diffusion chrome coating is a promising treatment for raising the efficiency of machine elements that are prone to oxidation or chemical corrosion at different temperatures.
