Research papers, master and doctoral theses about neural machine

Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information

670 - Association for Computational Linguistics 2021 - Article

Context-aware neural machine translation (NMT) incorporates contextual information from surrounding text, which can improve the quality of document-level machine translation. Many existing works on context-aware NMT have focused on developing new model architectures for incorporating additional context and have shown promising results. However, most existing works rely on cross-entropy loss, resulting in limited use of contextual information. In this paper, we propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences. By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency. We experimented with our method on common context-aware NMT models and two document-level translation tasks. In the experiments, our method consistently improved the BLEU scores of the compared models on English-German and English-Korean tasks. We also show that our method significantly improves coreference resolution on the English-German contrastive test suite.
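
The idea of corrupting coreference mentions and training the model to notice the difference can be illustrated with a small sketch. The abstract does not specify the corruption operation or the loss, so both are assumptions here: mentions are masked with a hypothetical `<mask>` token, and a generic margin-based (hinge) contrastive loss compares the log-probability of the reference translation under the original and the corrupted context. The function names are illustrative and are not taken from CorefCL.

```python
def corrupt_mentions(tokens, mention_spans, mask_token="<mask>"):
    """Return a copy of `tokens` with every coreference mention masked.

    `mention_spans` holds (start, end) index pairs, end exclusive.
    Masking is an assumed corruption; the paper may use other operations.
    """
    corrupted = list(tokens)
    for start, end in mention_spans:
        for i in range(start, end):
            corrupted[i] = mask_token
    return corrupted


def margin_contrastive_loss(logp_original, logp_corrupted, margin=1.0):
    """Hinge loss on the gap between two log-probabilities.

    The loss is zero once the reference translation is at least `margin`
    more likely with the original context than with the corrupted one.
    """
    return max(0.0, margin - (logp_original - logp_corrupted))


context = ["The", "cat", "sat", "down", "."]
print(corrupt_mentions(context, [(0, 2)]))
print(margin_contrastive_loss(-2.0, -2.5))
```

In a real system the two log-probabilities would come from the NMT model scoring the same target sentence twice, and this term would typically be added to the usual cross-entropy loss.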

Adapting Neural Machine Translation for Automatic Post-Editing

639 - Association for Computational Linguistics 2021 - Article

Automatic post-editing (APE) models are used to correct machine translation (MT) system outputs by learning from human post-editing patterns. We present the system used in our submission to the WMT'21 Automatic Post-Editing (APE) English-German (En-De) shared task. We leverage the state-of-the-art MT system (Ng et al., 2019) for this task. For further improvements, we adapt the MT model to the task domain by using WikiMatrix (Schwenk et al., 2021), followed by fine-tuning with additional APE samples from previous editions of the shared task (WMT-16,17,18) and ensembling the models. Our systems beat the baseline on TER scores on the WMT'21 test set.

GTCOM Neural Machine Translation Systems for WMT21

811 - Association for Computational Linguistics 2021 - Article

This paper describes Global Tone Communication Co., Ltd.'s submission to the WMT21 shared news translation task. We participate in six directions: English to/from Hausa, Hindi to/from Bengali and Zulu to/from Xhosa. Our submitted systems are unconstrained and focus on multilingual translation models, back-translation and forward-translation. We also apply rules and a language model to filter monolingual, parallel and synthetic sentences.

Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural Machine Translation Training

930 - Association for Computational Linguistics 2021 - Article

Learning a multilingual and multi-domain translation model is challenging because heterogeneous and imbalanced data make the model converge inconsistently over different corpora in the real world. One common practice is to adjust the share of each corpus in training, so that the learning process is balanced and low-resource cases can benefit from the high-resource ones. However, automatic balancing methods usually depend on intra- and inter-dataset characteristics, which are usually agnostic or require human priors. In this work, we propose an approach, MultiUAT, that dynamically adjusts the training data usage based on the model's uncertainty on a small set of trusted clean data for multi-corpus machine translation. We experiment with two classes of uncertainty measures in multilingual (16 languages with 4 settings) and multi-domain settings (4 for in-domain and 2 for out-of-domain on English-German translation) and demonstrate that our approach, MultiUAT, substantially outperforms its baselines, including both static and dynamic strategies. We analyze the cross-domain transfer and show the deficiency of static and similarity-based methods.
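
Adjusting each corpus's share of the training data from an uncertainty signal can be sketched as follows. This is a minimal illustration under stated assumptions, not the MultiUAT update rule: it assumes one scalar uncertainty score per corpus, measured on the trusted clean data, and assumes that corpora with higher uncertainty should be sampled more often. The `temperature` parameter and the corpus names are hypothetical.

```python
import math


def sampling_probs(uncertainty, temperature=1.0):
    """Turn per-corpus uncertainty scores into sampling probabilities.

    Uses a softmax, so higher uncertainty gives a larger share.
    A larger `temperature` flattens the distribution toward uniform.
    """
    top = max(uncertainty.values())  # subtract the max for numerical stability
    weights = {
        name: math.exp((score - top) / temperature)
        for name, score in uncertainty.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical uncertainty scores for three corpora.
scores = {"news": 0.8, "medical": 1.6, "subtitles": 0.4}
probs = sampling_probs(scores, temperature=0.5)
for name, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {p:.3f}")
```

A dynamic strategy would recompute the scores periodically during training and resample batches from the updated distribution, whereas a static strategy fixes the shares once before training.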

The Mininglamp Machine Translation System for WMT21

885 - Association for Computational Linguistics 2021 - Article

This paper describes the Mininglamp neural machine translation systems for the WMT2021 news translation tasks. We participated in eight translation directions for news text: Chinese to/from English, Hausa to/from English, German to/from English and French to/from German. Our fundamental system was based on the Transformer architecture, with wider or smaller configurations for different news translation tasks. We mainly used back-translation, knowledge distillation and fine-tuning to boost single models, while ensembling was used to combine single models. Our final submission ranked first for the English to/from Hausa task.

MiSS@WMT21: Contrastive Learning-reinforced Domain Adaptation in Neural Machine Translation

898 - Association for Computational Linguistics 2021 - Article

In this paper, we describe our MiSS system that participated in the WMT21 news translation task. We mainly participated in the evaluation of the three translation directions of English-Chinese and Japanese-English translation tasks. In the systems submitted, we primarily considered wider networks, deeper networks, relative positional encoding, and dynamic convolutional networks in terms of model structure, while in terms of training, we investigated contrastive learning-reinforced domain adaptation, self-supervised training, and optimization objective switching training methods. According to the final evaluation results, a deeper, wider, and stronger network can improve translation performance in general, yet our data domain adaptation method can improve performance even more. In addition, we found that switching to our proposed objective during the fine-tuning phase, using relatively small domain-related data, can effectively improve the stability of the model's convergence and achieve better optimal performance.

Simultaneous Neural Machine Translation with Constituent Label Prediction

590 - Association for Computational Linguistics 2021 - Article

Simultaneous translation is a task in which translation begins before the speaker has finished speaking, so it is important to decide when to start the translation process. However, deciding whether to read more input words or start to translate is difficult for language pairs with different word orders such as English and Japanese. Motivated by the concept of pre-reordering, we propose a couple of simple decision rules using the label of the next constituent predicted by incremental constituent label prediction. In experiments on English-to-Japanese simultaneous translation, the proposed method outperformed baselines in the quality-latency trade-off.

AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate

891 - Association for Computational Linguistics 2021 - Article

Non-autoregressive neural machine translation (NART) models suffer from the multi-modality problem, which causes translation inconsistencies such as token repetition. Most recent approaches have attempted to solve this problem by implicitly modeling dependencies between outputs. In this paper, we introduce AligNART, which leverages full alignment information to explicitly reduce the modality of the target distribution. AligNART divides the machine translation task into (i) alignment estimation and (ii) translation with aligned decoder inputs, guiding the decoder to focus on simplified one-to-one translation. To alleviate the alignment estimation problem, we further propose a novel alignment decomposition method. Our experiments show that AligNART outperforms previous non-iterative NART models that focus on explicit modality reduction on WMT14 En↔De and WMT16 Ro→En. Furthermore, AligNART achieves BLEU scores comparable to those of the state-of-the-art connectionist temporal classification based models on WMT14 En↔De. We also observe that AligNART effectively addresses the token repetition problem even without sequence-level knowledge distillation.

Data and Parameter Scaling Laws for Neural Machine Translation

802 - Association for Computational Linguistics 2021 - Article

We observe that the development cross-entropy loss of supervised neural machine translation models scales like a power law with the amount of training data and the number of non-embedding parameters in the model. We discuss some practical implications of these results, such as predicting BLEU achieved by large-scale models and predicting the ROI of labeling data in low-resource language pairs.
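
A power-law relationship of this kind can be fitted with ordinary least squares in log-log space. The sketch below assumes the simplest single-variable form, loss = a * x^(-b), where x is either the training data size or the parameter count; the abstract does not give the exact functional form used in the paper (it may, for example, include an irreducible-loss term or combine both variables), and the data points here are synthetic.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**(-b) by linear regression on (log x, log y).

    Returns the pair (a, b).
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict_loss(a, b, x):
    """Extrapolate the fitted power law to a new scale x."""
    return a * x ** (-b)


# Synthetic (training set size, dev loss) pairs generated from a known law.
sizes = [1e4, 1e5, 1e6, 1e7]
losses = [5.0 * s ** -0.25 for s in sizes]
a, b = fit_power_law(sizes, losses)
print(f"a = {a:.3f}, b = {b:.3f}")
print(f"extrapolated loss at 1e8 examples: {predict_loss(a, b, 1e8):.4f}")
```

Extrapolating such a fit is what makes it possible to estimate the return from labeling more data before paying for it, although extrapolation far beyond the observed range should be treated with caution.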

Relying on Discourse Analysis to Answer Complex Questions by Neural Machine Reading Comprehension

656 - Association for Computational Linguistics 2021 - Article

Machine reading comprehension (MRC) is one of the most challenging tasks in the natural language processing domain. Recent state-of-the-art results for MRC have been achieved with pre-trained language models, such as BERT and its modifications. Despite the high performance of these models, they still suffer from the inability to retrieve correct answers from detailed and lengthy passages. In this work, we introduce a novel scheme for incorporating the discourse structure of the text into a self-attention network and thus enrich the embedding obtained from the standard BERT encoder with additional linguistic knowledge. We also investigate the influence of different types of linguistic information on the model's ability to answer complex questions that require deep understanding of the whole text. Experiments performed on the SQuAD benchmark and more complex question answering datasets have shown that linguistic enhancement boosts the performance of the standard BERT model significantly.
