توضح هذه الورقة تقديم TENTRANS إلى مهمة مشتركة من Translation Translation منخفضة اللغات WMT21 لأزواج اللغة الرومانسية.تركز هذه المهمة على تحسين جودة الترجمة من الكاتالونية إلى Occitan والرومانية والإيطالية، بمساعدة لغات الموارد ذات الصلة ذات الصلة.نحن نستخدم أساسا الترجمة المرجانية، والطرق القائمة على المحور، ونماذج متعددة اللغات، ونقل النموذج المدربين مسبقا، ونقل المعرفة داخل المجال لتحسين جودة الترجمة.في مجموعة الاختبار، يحقق نظامنا الأفضل المقدم بمتوسط 43.45 درجات بلو حساسة لحالة الأحرف عبر جميع أزواج الموارد المنخفضة.تتوفر بياناتنا ورمز النماذج المدربة مسبقا مسبقا في هذا العمل في أمثلة تقييم Tentrans.
This paper describes TenTrans' submission to WMT21 Multilingual Low-Resource Translation shared task for the Romance language pairs. This task focuses on improving translation quality from Catalan to Occitan, Romanian and Italian, with the assistance of related high-resource languages. We mainly utilize back-translation, pivot-based methods, multilingual models, pre-trained model fine-tuning, and in-domain knowledge transfer to improve the translation quality. On the test set, our best-submitted system achieves an average of 43.45 case-sensitive BLEU scores across all low-resource pairs. Our data, code, and pre-trained models used in this work are available in TenTrans evaluation examples.
References used
https://aclanthology.org/
This paper describes Charles University sub-mission for Terminology translation shared task at WMT21. The objective of this task is to design a system which translates certain terms based on a provided terminology database, while preserving high over
In this work, we investigate methods for the challenging task of translating between low- resource language pairs that exhibit some level of similarity. In particular, we consider the utility of transfer learning for translating between several Indo-
This paper describes TenTrans large-scale multilingual machine translation system for WMT 2021. We participate in the Small Track 2 in five South East Asian languages, thirty directions: Javanese, Indonesian, Malay, Tagalog, Tamil, English. We mainly
This paper describes the participation of the BSC team in the WMT2021's Multilingual Low-Resource Translation for Indo-European Languages Shared Task. The system aims to solve the Subtask 2: Wikipedia cultural heritage articles, which involves transl
We present our submissions to the WMT21 shared task in Unsupervised and Very Low Resource machine translation between German and Upper Sorbian, German and Lower Sorbian, and Russian and Chuvash. Our low-resource systems (German↔Upper Sorbian, Russian