Do you want to publish a course? Click here

Unsupervised Neural Machine Translation with Universal Grammar

الترجمة الآلية العصبية غير الخاضعة لها مع قواعد اللغة العالمية

433   0   0   0.0 ( 0 )
 Publication date 2021
and research's language is English
 Created by Shamra Editor




Ask ChatGPT about the research

Machine translation usually relies on parallel corpora to provide parallel signals for training. The advent of unsupervised machine translation has brought machine translation away from this reliance, though performance still lags behind traditional supervised machine translation. In unsupervised machine translation, the model seeks symmetric language similarities as a source of weak parallel signal to achieve translation. Chomsky's Universal Grammar theory postulates that grammar is an innate form of knowledge to humans and is governed by universal principles and constraints. Therefore, in this paper, we seek to leverage such shared grammar clues to provide more explicit language parallel signals to enhance the training of unsupervised machine translation models. Through experiments on multiple typical language pairs, we demonstrate the effectiveness of our proposed approaches.



References used
https://aclanthology.org/
rate research

Read More

Non-autoregressive neural machine translation, which decomposes the dependence on previous target tokens from the inputs of the decoder, has achieved impressive inference speedup but at the cost of inferior accuracy. Previous works employ iterative d ecoding to improve the translation by applying multiple refinement iterations. However, a serious drawback is that these approaches expose the serious weakness in recognizing the erroneous translation pieces. In this paper, we propose an architecture named RewriteNAT to explicitly learn to rewrite the erroneous translation pieces. Specifically, RewriteNAT utilizes a locator module to locate the erroneous ones, which are then revised into the correct ones by a revisor module. Towards keeping the consistency of data distribution with iterative decoding, an iterative training strategy is employed to further improve the capacity of rewriting. Extensive experiments conducted on several widely-used benchmarks show that RewriteNAT can achieve better performance while significantly reducing decoding time, compared with previous iterative decoding strategies. In particular, RewriteNAT can obtain competitive results with autoregressive translation on WMT14 En-De, En-Fr and WMT16 Ro-En translation benchmarks.
Recently a number of approaches have been proposed to improve translation performance for document-level neural machine translation (NMT). However, few are focusing on the subject of lexical translation consistency. In this paper we apply one transla tion per discourse'' in NMT, and aim to encourage lexical translation consistency for document-level NMT. This is done by first obtaining a word link for each source word in a document, which tells the positions where the source word appears. Then we encourage the translation of those words within a link to be consistent in two ways. On the one hand, when encoding sentences within a document we properly share context information of those words. On the other hand, we propose an auxiliary loss function to better constrain that their translation should be consistent. Experimental results on Chinese↔English and English→French translation tasks show that our approach not only achieves state-of-the-art performance in BLEU scores, but also greatly improves lexical consistency in translation.
Scheduled sampling is widely used to mitigate the exposure bias problem for neural machine translation. Its core motivation is to simulate the inference scene during training by replacing ground-truth tokens with predicted tokens, thus bridging the g ap between training and inference. However, vanilla scheduled sampling is merely based on training steps and equally treats all decoding steps. Namely, it simulates an inference scene with uniform error rates, which disobeys the real inference scene, where larger decoding steps usually have higher error rates due to error accumulations. To alleviate the above discrepancy, we propose scheduled sampling methods based on decoding steps, increasing the selection chance of predicted tokens with the growth of decoding steps. Consequently, we can more realistically simulate the inference scene during training, thus better bridging the gap between training and inference. Moreover, we investigate scheduled sampling based on both training steps and decoding steps for further improvements. Experimentally, our approaches significantly outperform the Transformer baseline and vanilla scheduled sampling on three large-scale WMT tasks. Additionally, our approaches also generalize well to the text summarization task on two popular benchmarks.
Back-translation (BT) has become one of the de facto components in unsupervised neural machine translation (UNMT), and it explicitly makes UNMT have translation ability. However, all the pseudo bi-texts generated by BT are treated equally as clean da ta during optimization without considering the quality diversity, leading to slow convergence and limited translation performance. To address this problem, we propose a curriculum learning method to gradually utilize pseudo bi-texts based on their quality from multiple granularities. Specifically, we first apply crosslingual word embedding to calculate the potential translation difficulty (quality) for the monolingual sentences. Then, the sentences are fed into UNMT from easy to hard batch by batch. Furthermore, considering the quality of sentences/tokens in a particular batch are also diverse, we further adopt the model itself to calculate the fine-grained quality scores, which are served as learning factors to balance the contributions of different parts when computing loss and encourage the UNMT model to focus on pseudo data with higher quality. Experimental results on WMT 14 En-Fr, WMT 14 En-De, WMT 16 En-Ro, and LDC En-Zh translation tasks demonstrate that the proposed method achieves consistent improvements with faster convergence speed.
The paper presents experiments in neural machine translation with lexical constraints into a morphologically rich language. In particular and we introduce a method and based on constrained decoding and which handles the inflected forms of lexical ent ries and does not require any modification to the training data or model architecture. To evaluate its effectiveness and we carry out experiments in two different scenarios: general and domain-specific. We compare our method with baseline translation and i.e. translation without lexical constraints and in terms of translation speed and translation quality. To evaluate how well the method handles the constraints and we propose new evaluation metrics which take into account the presence and placement and duplication and inflectional correctness of lexical terms in the output sentence.

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا