Automatic post-editing (APE) models are used to correct machine translation (MT) system outputs by learning from human post-editing patterns. We present the system used in our submission to the WMT'21 Automatic Post-Editing (APE) English-German (En-De) shared task. We leverage a state-of-the-art MT system (Ng et al., 2019) for this task. For further improvements, we adapt the MT model to the task domain using WikiMatrix (Schwenk et al., 2021), followed by fine-tuning with additional APE samples from previous editions of the shared task (WMT-16, 17, 18) and ensembling the models. Our systems beat the baseline on TER scores on the WMT'21 test set.
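Since TER (Translation Edit Rate) is the headline metric here, a minimal sketch of how it is computed may help. The version below is a simplification: full TER (Snover et al., 2006) also allows block shifts of phrases, while this sketch counts only word-level insertions, deletions, and substitutions, normalized by the reference length.

```python
def simple_ter(hypothesis: str, reference: str) -> float:
    """Simplified TER: word-level edit distance / reference length.

    Full TER additionally counts block shifts; this sketch covers
    only insertions, deletions, and substitutions.
    """
    hyp, ref = hypothesis.split(), reference.split()
    m, n = len(hyp), len(ref)

    # Classic Levenshtein dynamic program over words.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all remaining hypothesis words
    for j in range(n + 1):
        dp[0][j] = j  # insert all remaining reference words
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,        # deletion
                dp[i][j - 1] + 1,        # insertion
                dp[i - 1][j - 1] + cost  # substitution or match
            )
    return dp[m][n] / max(n, 1)
```

Lower is better: an identical hypothesis scores 0.0, and each uncorrected edit relative to the (post-edited) reference increases the score.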