TenTrans Large-Scale Multilingual Machine Translation System for WMT21

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

This paper describes the TenTrans large-scale multilingual machine translation system for WMT 2021. We participate in Small Track 2, which covers five South East Asian languages plus English (Javanese, Indonesian, Malay, Tagalog, Tamil, and English), for thirty translation directions. We mainly use forward/back-translation, in-domain data selection, knowledge distillation, and gradual fine-tuning from the pre-trained FLORES-101 model. We find that forward/back-translation significantly improves translation results, that data selection and gradual fine-tuning are particularly effective for domain adaptation, and that knowledge distillation brings a slight performance improvement. Model averaging is also used to further improve translation performance on top of these systems. Our final system achieves an average BLEU score of 28.89 across the thirty directions on the test set.
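The abstract does not say how model averaging is performed. As a rough illustration only, the sketch below assumes the common approach of taking the element-wise mean of the parameters of several checkpoints. The function name `average_checkpoints`, the `Params` type, and the toy parameter values are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def average_checkpoints(checkpoints: List[Params]) -> Params:
    """Return the element-wise mean of each parameter across checkpoints.

    Assumes every checkpoint has the same parameter names and shapes.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged: Params = {}
    for name in checkpoints[0]:
        # Collect this parameter's vector from every checkpoint.
        vectors = [ckpt[name] for ckpt in checkpoints]
        # zip(*vectors) groups the i-th weight of each checkpoint together.
        averaged[name] = [sum(vals) / len(vals) for vals in zip(*vectors)]
    return averaged


# Toy example with two made-up checkpoints.
ckpts = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
avg = average_checkpoints(ckpts)
print(avg)
```

In practice the same idea would be applied to the tensors of a trained model's saved state rather than to plain Python lists.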

References used

https://aclanthology.org/

Maastricht University's Large-Scale Multilingual Machine Translation System for WMT 2021

Association for Computational Linguistics, 2021 (article)

We present our development of a multilingual machine translation system for the large-scale multilingual machine translation task at WMT 2021. Starting from the provided baseline system, we investigated several techniques to improve translation quality on the target subset of languages. We were able to significantly improve translation quality by adapting the system towards the target subset of languages and by generating synthetic data using the initial model. Techniques successfully applied in zero-shot multilingual machine translation (e.g. a similarity regularizer) had only a minor effect on the final translation performance.

TenTrans Multilingual Low-Resource Translation System for WMT21 Indo-European Languages Task

Association for Computational Linguistics, 2021 (article)

This paper describes TenTrans' submission to the WMT21 Multilingual Low-Resource Translation shared task for the Romance language pairs. This task focuses on improving translation quality from Catalan to Occitan, Romanian, and Italian, with the assistance of related high-resource languages. We mainly utilize back-translation, pivot-based methods, multilingual models, pre-trained model fine-tuning, and in-domain knowledge transfer to improve translation quality. On the test set, our best submitted system achieves an average case-sensitive BLEU score of 43.45 across all low-resource pairs. The data, code, and pre-trained models used in this work are available in the TenTrans evaluation examples.

Back-translation for Large-Scale Multilingual Machine Translation

Association for Computational Linguistics, 2021 (article)

This paper illustrates our approach to the shared task on large-scale multilingual machine translation at the Sixth Conference on Machine Translation (WMT-21). In this work, we aim to build a single multilingual translation system under the hypothesis that a universal cross-language representation leads to better multilingual translation performance. We extend the exploration of different back-translation methods from bilingual translation to multilingual translation. Better performance is obtained with the constrained sampling method, which differs from the findings for bilingual translation. We also explore the effect of vocabulary size and the amount of synthetic data. Surprisingly, smaller vocabularies perform better, and extensive monolingual English data offers a modest improvement. We submitted to both small tasks and achieved second place.

The Mininglamp Machine Translation System for WMT21

Association for Computational Linguistics, 2021 (article)

This paper describes the Mininglamp neural machine translation systems for the WMT2021 news translation tasks. We participated in eight translation directions for news text: Chinese to/from English, Hausa to/from English, German to/from English, and French to/from German. Our base system was built on the Transformer architecture, with wider or smaller configurations for different news translation tasks. We mainly used back-translation, knowledge distillation, and fine-tuning to strengthen single models, and ensembling to combine them. Our final submission ranked first for the English to/from Hausa task.

Findings of the WMT 2021 Shared Task on Large-Scale Multilingual Machine Translation

Association for Computational Linguistics, 2021 (article)

We present the results of the first task on Large-Scale Multilingual Machine Translation. The task consists of the many-to-many evaluation of a single model across a variety of source and target languages. This year, the task consisted of three different settings: (i) SMALL-TASK1 (Central/South-Eastern European Languages), (ii) SMALL-TASK2 (South-East Asian Languages), and (iii) FULL-TASK (all 101 x 100 language pairs). All the tasks used the FLORES-101 dataset as the evaluation benchmark. To ensure the longevity of the dataset, the test sets were not publicly released and the models were evaluated in a controlled environment on Dynabench. A total of 10 teams participated in the tasks, with 151 intermediate model submissions and 13 final models. This year's results show a significant improvement over the known baselines, with +17.8 BLEU for SMALL-TASK2, +10.6 for FULL-TASK, and +3.6 for SMALL-TASK1.
