ﻻ يوجد ملخص باللغة العربية
Discourse coherence plays an important role in the translation of one text. However, the previous reported models most focus on improving performance over individual sentence while ignoring cross-sentence links and dependencies, which affects the coherence of the text. In this paper, we propose to use discourse context and reward to refine the translation quality from the discourse perspective. In particular, we generate the translation of individual sentences at first. Next, we deliberate the preliminary produced translations, and train the model to learn the policy that produces discourse coherent text by a reward teacher. Practical results on multiple discourse test datasets indicate that our model significantly improves the translation quality over the state-of-the-art baseline system by +1.23 BLEU score. Moreover, our model generates more discourse coherent text and obtains +2.2 BLEU improvements when evaluated by discourse metrics.
Existing neural machine translation (NMT) systems utilize sequence-to-sequence neural networks to generate target translation word by word, and then make the generated word at each time-step and the counterpart in the references as consistent as poss
Previous work on neural noisy channel modeling relied on latent variable models that incrementally process the source and target sentence. This makes decoding decisions based on partial source prefixes even though the full source is available. We pur
Cross-attention is an important component of neural machine translation (NMT), which is always realized by dot-product attention in previous methods. However, dot-product attention only considers the pair-wise correlation between words, resulting in
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency, which makes the Transformer-based NMT achieve state-of-the-art result
Previous works have shown that contextual information can improve the performance of neural machine translation (NMT). However, most existing document-level NMT methods only consider a few number of previous sentences. How to make use of the whole do