This paper describes Kakao Enterprise's submission to the WMT21 shared task on Machine Translation using Terminologies. We integrate terminology constraints by pre-training with target lemma annotations and then fine-tuning with exact target annotations, using the provided terminology dataset. This approach yields a model that achieves outstanding results in both translation quality and term consistency, ranking first by COMET in the En→Fr language direction. We also explore several additional methods: back-translation, explicitly training on terminologies as additional parallel data, and in-domain data selection.
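As a rough illustration of what annotating a source sentence with exact target terms can look like, the following is a minimal sketch and not the authors' implementation. The function name `annotate_source`, the `<t>`/`</t>` tag format, and the toy terminology entries are all assumptions made for this example; the abstract does not specify the annotation format. A lemma-annotation variant would insert the lemma of the target term instead of its exact surface form, which would need a lemmatizer and is not shown here.

```python
import re
from typing import Dict


def annotate_source(
    sentence: str,
    terminology: Dict[str, str],
    open_tag: str = "<t>",   # hypothetical tag, not taken from the paper
    close_tag: str = "</t>",  # hypothetical tag, not taken from the paper
) -> str:
    """Insert the target-side term right after each matched source term."""
    if not terminology:
        return sentence

    # Put longer terms first so multi-word entries win over their sub-terms.
    terms = sorted(terminology, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    # Case-insensitive lookup from source term to target term.
    lookup = {src.lower(): tgt for src, tgt in terminology.items()}

    def add_target(match: "re.Match[str]") -> str:
        source_span = match.group(0)
        target_term = lookup[source_span.lower()]
        return f"{source_span} {open_tag} {target_term} {close_tag}"

    # A single pass over the sentence avoids re-matching inserted annotations.
    return pattern.sub(add_target, sentence)


if __name__ == "__main__":
    toy_terms = {"vaccine": "vaccin", "viral load": "charge virale"}
    print(annotate_source("The vaccine reduces viral load.", toy_terms))
```

In a setup like this, annotated source sentences would be paired with their reference translations as training data, so the model can learn to copy the annotated term into its output.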
References used
https://aclanthology.org/
This paper describes our work in the WMT 2021 Machine Translation using Terminologies Shared Task. We participate in the shared translation terminologies task in English to Chinese language pair. To satisfy terminology constraints on translation, we
Language domains that require very careful use of terminology are abundant and reflect a significant part of the translation industry. In this work we introduce a benchmark for evaluating the quality and consistency of terminology translation, focusi
This paper describes NiuTrans neural machine translation systems of the WMT 2021 news translation tasks. We made submissions to 9 language directions, including English2Chinese, Japanese, Russian, Icelandic and English2Hausa tasks. Our primary system
This paper describes the Global Tone Communication Co., Ltd.'s submission of the WMT21 shared news translation task. We participate in six directions: English to/from Hausa, Hindi to/from Bengali and Zulu to/from Xhosa. Our submitted systems are unco
This paper describes Lingua Custodia's submission to the WMT21 shared task on machine translation using terminologies. We consider three directions, namely English to French, Russian, and Chinese. We rely on a Transformer-based architecture as a buil