This paper describes our system (Team ID: nictrb) for the WAT'21 restricted machine translation task. In our submitted system, we designed a new training approach for restricted machine translation. By sampling constraints from the translation target, we address the problem that ordinary training data carries no restricted vocabulary. Combined with constrained decoding in the inference phase, we achieved better results than the baseline, confirming the effectiveness of our solution. In addition, we tried both the vanilla and the sparse Transformer as the backbone network of the model, as well as model ensembling, which further improved the final translation performance.
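The core of the training approach is to synthesize constraint-annotated data from ordinary parallel corpora by sampling spans from the target side. A minimal sketch of this idea follows; the span-sampling policy, the `<sep>` separator token, and the function names are assumptions for illustration, not the authors' exact procedure.

```python
import random

def sample_constraints(target_tokens, max_constraints=2, max_len=3, seed=None):
    """Sample non-overlapping n-gram spans from the target sentence to
    serve as lexical constraints during training (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(target_tokens)
    constraints, used = [], set()
    for _ in range(max_constraints):
        if n == 0:
            break
        length = rng.randint(1, min(max_len, n))
        start = rng.randint(0, n - length)
        span = set(range(start, start + length))
        if used & span:  # skip overlapping spans
            continue
        used |= span
        constraints.append(target_tokens[start:start + length])
    return constraints

def augment_source(source_tokens, constraints, sep="<sep>"):
    """Append sampled constraints to the source sequence so the model
    learns to copy them into its output (hypothetical input format)."""
    out = list(source_tokens)
    for constraint in constraints:
        out.append(sep)
        out.extend(constraint)
    return out
```

At inference time, the same input format would carry the genuinely required vocabulary, and constrained decoding then enforces that the sampled-in terms actually appear in the hypothesis.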