Previous works mainly focus on improving cross-lingual transfer for NLU tasks with a multilingual pretrained encoder (MPE), or on improving translation performance on NMT tasks with BERT. However, how to improve the cross-lingual transfer of an NMT model with a multilingual pretrained encoder remains under-explored. In this paper, we focus on a zero-shot cross-lingual transfer task in NMT. In this task, the NMT model is trained with one parallel dataset and an off-the-shelf MPE, and then directly tested on zero-shot language pairs. We propose SixT, a simple yet effective model for this task. SixT leverages the MPE with a two-stage training schedule and gains further improvement from a position-disentangled encoder and a capacity-enhanced decoder. Extensive experiments show that SixT significantly improves the translation quality of unseen languages. With much less computation cost and training data, our model achieves better performance on many-to-English test sets than CRISS and m2m-100, two strong multilingual NMT baselines.
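To make the two-stage training schedule concrete, below is a minimal sketch in PyTorch. It is an illustration under assumptions, not SixT's actual implementation: the names `SixTStyleModel`, `two_stage_training`, `train_epoch`, and `parallel_data` are hypothetical placeholders, and the learning rates and epoch counts are arbitrary. The idea shown is that stage 1 freezes the pretrained encoder and trains only the decoder, while stage 2 fine-tunes the whole model jointly.

```python
# Hedged sketch of a two-stage training schedule over an off-the-shelf MPE.
# All names and hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn


class SixTStyleModel(nn.Module):
    def __init__(self, mpe_encoder: nn.Module, decoder: nn.Module):
        super().__init__()
        self.encoder = mpe_encoder  # off-the-shelf multilingual pretrained encoder
        self.decoder = decoder      # randomly initialized (capacity-enhanced) decoder

    def forward(self, src, tgt):
        # Decoder attends over the pretrained encoder's representations.
        return self.decoder(tgt, memory=self.encoder(src))


def set_requires_grad(module: nn.Module, flag: bool) -> None:
    for p in module.parameters():
        p.requires_grad = flag


def two_stage_training(model, parallel_data, train_epoch,
                       stage1_epochs=5, stage2_epochs=5):
    # Stage 1: freeze the pretrained encoder; train only the decoder so it
    # learns to read the MPE's (language-agnostic) representations.
    set_requires_grad(model.encoder, False)
    opt = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-4)
    for _ in range(stage1_epochs):
        train_epoch(model, parallel_data, opt)

    # Stage 2: unfreeze everything and fine-tune jointly at a lower rate,
    # so the encoder adapts without drifting far from its pretrained state.
    set_requires_grad(model.encoder, True)
    opt = torch.optim.Adam(model.parameters(), lr=1e-5)
    for _ in range(stage2_epochs):
        train_epoch(model, parallel_data, opt)
```

The split matters because a randomly initialized decoder would otherwise back-propagate noisy gradients into the pretrained encoder and erode its cross-lingual representations.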
Transferring representations from large supervised tasks to downstream tasks has shown promising results in AI fields such as Computer Vision and Natural Language Processing (NLP). In parallel, the recent progress in Machine Translation (MT) has enabled …
Multilingual machine translation enables a single model to translate between different languages. Most existing multilingual machine translation systems adopt a randomly initialized Transformer backbone. In this work, inspired by the recent success of …
Transfer learning between different language pairs has shown its effectiveness for Neural Machine Translation (NMT) in low-resource scenarios. However, existing transfer methods involving a common target language are far from success in the extreme scenario …
We propose a simple solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages. Our solution requires no change in the model architecture from our base system but instead introduces an artificial token at the beginning of the input sentence to specify the required target language. …
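The artificial-token mechanism in the abstract above is simple enough to show in a few lines. The sketch below is illustrative only: the `<2xx>` tag format and the `tag_source` helper are hypothetical, since the actual token convention is fixed by the system's shared vocabulary.

```python
# Hedged sketch: route one shared NMT model to different target languages
# by prepending an artificial target-language token to the source sentence.
def tag_source(sentence: str, target_lang: str) -> str:
    """Prepend a target-language token (illustrative "<2xx>" format)."""
    return f"<2{target_lang}> {sentence}"


# The same English input, tagged for two different target languages:
print(tag_source("How are you?", "es"))  # <2es> How are you?
print(tag_source("How are you?", "ja"))  # <2ja> How are you?
```

Because the tag is just another vocabulary item, the model architecture itself is unchanged; the token steers decoding toward the requested language.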
The recently proposed massively multilingual neural machine translation (NMT) system has been shown to be capable of translating over 100 languages to and from English within a single model. Its improved translation performance on low-resource languages …