TenTrans Multilingual Low-Resource Translation System for WMT21 Indo-European Languages Task

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Language: English

Created by Shamra Editor

Keywords: multilingual NMT, Indo-European languages task, multilingual, low-resource

Abstract

This paper describes TenTrans' submission to the WMT21 Multilingual Low-Resource Translation shared task for the Romance language pairs. The task focuses on improving translation quality from Catalan to Occitan, Romanian, and Italian with the assistance of related high-resource languages. We mainly use back-translation, pivot-based methods, multilingual models, pre-trained model fine-tuning, and in-domain knowledge transfer to improve translation quality. On the test set, our best submitted system achieves an average case-sensitive BLEU score of 43.45 across all low-resource pairs. The data, code, and pre-trained models used in this work are available in TenTrans evaluation examples.

References used

https://aclanthology.org/

CUNI systems for WMT21: Multilingual Low-Resource Translation for Indo-European Languages Shared Task

Association for Computational Linguistics, 2021 (article)

This paper describes the Charles University submission for the Terminology translation shared task at WMT21. The objective of this task is to design a system that translates certain terms based on a provided terminology database while preserving high overall translation quality. We competed in the English-French language pair. Our approach is based on providing the desired translations alongside the input sentence and training the model to use these provided terms. We lemmatize the terms during both training and inference, so that the model learns to produce the correct surface forms of the words when they differ from the forms provided in the terminology database.

Keywords: multilingual low-resource translation, Indo-European languages, shared task

Machine Translation of Low-Resource Indo-European Languages

Association for Computational Linguistics, 2021 (article)

In this work, we investigate methods for the challenging task of translating between low-resource language pairs that exhibit some level of similarity. In particular, we consider the utility of transfer learning for translating between several Indo-European low-resource languages from the Germanic and Romance language families. We build two main classes of transfer-based systems to study how relatedness can benefit translation performance. The primary system fine-tunes a model pre-trained on a related language pair, and the contrastive system fine-tunes one pre-trained on an unrelated language pair. Our experiments show that although relatedness is not necessary for transfer learning to work, it does benefit model performance.

Keywords: translation task, Germanic and Romance

TenTrans Large-Scale Multilingual Machine Translation System for WMT21

Association for Computational Linguistics, 2021 (article)

This paper describes the TenTrans large-scale multilingual machine translation system for WMT 2021. We participate in Small Track 2, which covers five South East Asian languages plus English in thirty directions: Javanese, Indonesian, Malay, Tagalog, Tamil, and English. We mainly use forward/back-translation, in-domain data selection, knowledge distillation, and gradual fine-tuning from the pre-trained model FLORES-101. We find that forward/back-translation significantly improves the translation results, that data selection and gradual fine-tuning are particularly effective for domain adaptation, and that knowledge distillation brings a slight performance improvement. Model averaging is also used to further improve translation performance on top of these systems. Our final system achieves an average BLEU score of 28.89 across the thirty directions on the test set.

Keywords: large-scale multilingual, TenTrans

Transfer Learning with Shallow Decoders: BSC at WMT2021's Multilingual Low-Resource Translation for Indo-European Languages Shared Task

Association for Computational Linguistics, 2021 (article)

This paper describes the participation of the BSC team in WMT2021's Multilingual Low-Resource Translation for Indo-European Languages Shared Task. The system addresses Subtask 2: Wikipedia cultural heritage articles, which involves translation in four Romance languages: Catalan, Italian, Occitan, and Romanian. The submitted system is a multilingual semi-supervised machine translation model. It is based on a pre-trained language model, namely XLM-RoBERTa, that is later fine-tuned with parallel data obtained mostly from OPUS. Unlike other works, we only use XLM to initialize the encoder and randomly initialize a shallow decoder. The reported results are robust and perform well for all tested languages.

Keywords: shared task

The LMU Munich Systems for the WMT21 Unsupervised and Very Low-Resource Translation Task

Association for Computational Linguistics, 2021 (article)

We present our submissions to the WMT21 shared task in Unsupervised and Very Low Resource machine translation between German and Upper Sorbian, German and Lower Sorbian, and Russian and Chuvash. Our low-resource systems (German↔Upper Sorbian, Russian↔Chuvash) are pre-trained on high-resource pairs of related languages. We fine-tune those systems using the available authentic parallel data and improve them by iterated back-translation. The unsupervised German↔Lower Sorbian system is initialized by the best Upper Sorbian system and improved by iterated back-translation using monolingual data only.

Keywords: LMU Munich systems, LMU Munich, Upper Sorbian
