
On the Difficulty of Translating Free-Order Case-Marking Languages


Publication date: 2021
Research language: English





Abstract

Identifying factors that make certain languages harder to model than others is essential to reach language equality in future Natural Language Processing technologies. Free-order case-marking languages, such as Russian, Latin, or Tamil, have proved more challenging than fixed-order languages for the tasks of syntactic parsing and subject-verb agreement prediction. In this work, we investigate whether this class of languages is also more difficult to translate by state-of-the-art Neural Machine Translation (NMT) models. Using a variety of synthetic languages and a newly introduced translation challenge set, we find that word order flexibility in the source language only leads to a very small loss of NMT quality, even though the core verb arguments become impossible to disambiguate in sentences without semantic cues. The latter issue is indeed solved by the addition of case marking. However, in medium- and low-resource settings, the overall NMT quality of fixed-order languages remains unmatched.
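
To make the setup concrete, the following is a minimal, hypothetical sketch of how a free-order, optionally case-marked source sentence could be generated from a fixed subject-verb-object template; the case suffixes and vocabulary are illustrative assumptions, not the paper's actual synthetic grammar.

```python
import random

# Hypothetical case suffixes; the paper's actual synthetic grammar may differ.
CASE_SUFFIX = {"subj": "-ka", "obj": "-ta"}

def make_free_order_example(subject, verb, obj, case_marking=True):
    """Build one synthetic source sentence with shuffled argument order.

    With case marking, the subject/object roles stay recoverable even after
    shuffling; without it, 'dog bites man' and 'man bites dog' collapse.
    """
    subj_tok = subject + (CASE_SUFFIX["subj"] if case_marking else "")
    obj_tok = obj + (CASE_SUFFIX["obj"] if case_marking else "")
    tokens = [subj_tok, verb, obj_tok]
    random.shuffle(tokens)           # free word order in the source
    return " ".join(tokens)

# A sentence with no semantic cue about who bites whom:
print(make_free_order_example("dog", "bites", "man", case_marking=False))
print(make_free_order_example("dog", "bites", "man", case_marking=True))
```

Without the suffixes, a shuffled sentence like "man bites dog" gives a translation model no surface cue about which argument is the subject, which is exactly the ambiguity that case marking resolves.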

Related research

Word segmentation, the problem of finding word boundaries in speech, is of interest for a range of tasks. Previous papers have suggested that for sequence-to-sequence models trained on tasks such as speech translation or speech recognition, attention can be used to locate and segment the words. We show, however, that even on monolingual data this approach is brittle. In our experiments with different input types, data sizes, and segmentation algorithms, only models trained to predict phones from words succeed in the task. Models trained to predict words from either phones or speech (i.e., the opposite direction needed to generalize to new data), yield much worse results, suggesting that attention-based segmentation is only useful in limited scenarios.
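
As a rough, hypothetical illustration of the attention-based segmentation idea (the cited work's models, inputs, and algorithms differ), word boundaries can be read off a soft alignment matrix by tracking where the most-attended output unit changes:

```python
import numpy as np

def boundaries_from_attention(attn):
    """Infer segment boundaries from an attention matrix.

    attn: array of shape (num_input_frames, num_output_words); attn[i, j]
    is the weight that input position i puts on output word j.
    Returns the input indices where the most-attended word changes.
    """
    best_word = attn.argmax(axis=1)              # hard alignment per input frame
    changes = np.flatnonzero(np.diff(best_word)) + 1
    return changes.tolist()

# Toy example: 6 input positions softly aligned to 3 output words.
attn = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.0, 0.3, 0.7],
    [0.0, 0.1, 0.9],
])
print(boundaries_from_attention(attn))  # -> [2, 4]
```
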
In this work, we investigate methods for the challenging task of translating between low-resource language pairs that exhibit some level of similarity. In particular, we consider the utility of transfer learning for translating between several Indo-European low-resource languages from the Germanic and Romance language families. We build two main classes of transfer-based systems to study how relatedness can benefit translation performance: the primary system fine-tunes a model pre-trained on a related language pair, while the contrastive system fine-tunes one pre-trained on an unrelated language pair. Our experiments show that although relatedness is not necessary for transfer learning to work, it does benefit model performance.
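
A minimal sketch of this fine-tuning recipe, assuming the Hugging Face transformers library, an illustrative parent model (Helsinki-NLP/opus-mt-de-en), and a toy child-language sentence pair; the cited paper's actual models, language pairs, and hyperparameters are not reproduced here.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Illustrative parent model pre-trained on a (related) higher-resource pair.
parent = "Helsinki-NLP/opus-mt-de-en"

tokenizer = AutoTokenizer.from_pretrained(parent)
model = AutoModelForSeq2SeqLM.from_pretrained(parent)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One fine-tuning step on a toy low-resource (child) sentence pair.
batch = tokenizer(["en katt sover"], text_target=["a cat is sleeping"],
                  return_tensors="pt", padding=True)
loss = model(**batch).loss   # cross-entropy against the reference translation
loss.backward()
optimizer.step()
```
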
This research discusses the mechanism of operational integration between a seaport and a free zone, using Jebel Ali Free Zone as a case study, selected for its high performance and distinguished position at both the Arab and international levels. Definitions of the various types of free zones are introduced, along with a figure representing the mechanism of operational integration, supported by a thorough analysis of the current cargo flow between Jebel Ali Free Zone and Port across several fields and levels, such as customs clearance and informatics systems (Tradnet, Main, Mirsal, and Electronic Data Interchange) and their role in creating such integration. Statistical hypothesis tests were conducted on the relationship among the number of containers handled at the port and free zone, exports, imports, and Gross Domestic Product (GDP). The research concludes by stressing the importance of benefiting from the Jebel Ali Free Zone experience of integration between the seaport and free zone.
Multilingual neural machine translation models typically handle one source language at a time. However, prior work has shown that translating from multiple source languages improves translation quality. Different from existing approaches on multi-source translation that are limited to the test scenario where parallel source sentences from multiple languages are available at inference time, we propose to improve multilingual translation in a more common scenario by exploiting synthetic source sentences from auxiliary languages. We train our model on synthetic multi-source corpora and apply random masking to enable flexible inference with single-source or bi-source inputs. Extensive experiments on Chinese/English-Japanese and a large-scale multilingual translation benchmark show that our model outperforms the multilingual baseline significantly by up to +4.0 BLEU with the largest improvements on low-resource or distant language pairs.
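
The masking idea might be sketched as follows; the separator, placeholder token, and drop probability are assumptions for illustration rather than the authors' exact scheme.

```python
import random

MASK = "<mask_src>"   # hypothetical placeholder for a dropped auxiliary source

def build_multisource_input(primary_src, auxiliary_src, p_drop_aux=0.5, sep="<sep>"):
    """Concatenate primary and auxiliary sources, randomly masking the auxiliary.

    Randomly dropping the synthetic auxiliary source during training lets the
    same model be used with either one or two sources at inference time.
    """
    if auxiliary_src is None or random.random() < p_drop_aux:
        auxiliary_src = MASK
    return f"{primary_src} {sep} {auxiliary_src}"

# Training-time example: Chinese primary source, synthetic English auxiliary.
print(build_multisource_input("我 喜欢 猫", "I like cats"))
# Inference with only the primary source available:
print(build_multisource_input("我 喜欢 猫", None))
```
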
Few-shot relation extraction (FSRE) focuses on recognizing novel relations by learning with merely a handful of annotated instances. Meta-learning has been widely adopted for such a task, which trains on randomly generated few-shot tasks to learn generic data representations. Despite impressive results achieved, existing models still perform suboptimally when handling hard FSRE tasks, where the relations are fine-grained and similar to each other. We argue this is largely because existing models do not distinguish hard tasks from easy ones in the learning process. In this paper, we introduce a novel approach based on contrastive learning that learns better representations by exploiting relation label information. We further design a method that allows the model to adaptively learn how to focus on hard tasks. Experiments on two standard datasets demonstrate the effectiveness of our method.
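
For intuition, a generic supervised contrastive objective that exploits relation labels (instances of the same relation act as positives) might look like the PyTorch sketch below; it is not necessarily the exact loss, nor the adaptive hard-task weighting, used in the cited paper.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Pull same-relation instances together, push different relations apart.

    embeddings: (N, d) tensor of relation-instance representations.
    labels:     (N,)   tensor of relation label ids.
    """
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                          # pairwise similarities
    not_self = ~torch.eye(len(labels), dtype=torch.bool)   # mask the diagonal
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & not_self

    # Log-softmax over all other instances (self excluded from the denominator).
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(~not_self, float("-inf")), dim=1, keepdim=True)

    # Average log-probability of the positives for each anchor that has any.
    pos_counts = pos_mask.sum(dim=1)
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_counts.clamp(min=1)
    return loss[pos_counts > 0].mean()

# Toy usage: four instances, two relation labels.
emb = torch.randn(4, 16)
lab = torch.tensor([0, 0, 1, 1])
print(supervised_contrastive_loss(emb, lab))
```
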
