
Counterfactual Data Augmentation for Neural Machine Translation



Added by Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor


Abstract

We propose a data augmentation method for neural machine translation. It works by interpreting language models and phrasal alignment causally. Specifically, it creates augmented parallel translation corpora by generating (path-specific) counterfactual aligned phrases. We generate these by sampling new source phrases from a masked language model, then sampling an aligned counterfactual target phrase by noting that a translation language model can be interpreted as a Gumbel-Max Structural Causal Model (Oberst and Sontag, 2019). Compared to previous work, our method takes both context and alignment into account to maintain the symmetry between source and target sequences. Experiments on IWSLT'15 English → Vietnamese, WMT'17 English → German, WMT'18 English → Turkish, and WMT'19 robust English → French show that the method can improve the performance of translation, backtranslation and translation robustness.
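The Gumbel-Max counterfactual step mentioned in the abstract can be illustrated on a toy categorical distribution. The sketch below is not the authors' implementation: it assumes a single categorical choice with known logits, and the function names (`posterior_gumbel_noise`, `counterfactual_sample`) are hypothetical. It samples Gumbel noise consistent with an observed outcome, then reuses that noise under new logits to obtain a counterfactual outcome.

```python
import math
import random


def sample_gumbel(loc, rng):
    """Draw one sample from Gumbel(loc, 1) by inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return loc - math.log(-math.log(u))


def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def truncate_below(g, z):
    """Stable form of -log(exp(-g) + exp(-z)); the result never exceeds z."""
    return min(g, z) - math.log1p(math.exp(-abs(g - z)))


def posterior_gumbel_noise(logits, observed, rng):
    """Sample noise e_i such that argmax_i(logits[i] + e_i) == observed."""
    # The maximum of the perturbed logits is Gumbel(logsumexp(logits)).
    z = sample_gumbel(logsumexp(logits), rng)
    noise = []
    for i, phi in enumerate(logits):
        if i == observed:
            g = z  # the observed index attains the maximum
        else:
            g = truncate_below(sample_gumbel(phi, rng), z)
        noise.append(g - phi)
    return noise


def counterfactual_sample(logits, observed, new_logits, rng):
    """Counterfactual outcome under new_logits, given the observed outcome."""
    noise = posterior_gumbel_noise(logits, observed, rng)
    scores = [phi + eps for phi, eps in zip(new_logits, noise)]
    return max(range(len(scores)), key=scores.__getitem__)


rng = random.Random(0)
logits = [math.log(0.5), math.log(0.3), math.log(0.2)]
shifted = [math.log(0.1), math.log(0.6), math.log(0.3)]
print(counterfactual_sample(logits, 0, shifted, rng))
```

When the new logits equal the original ones, the counterfactual outcome coincides with the observed one, which is a quick sanity check on the posterior sampling.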

References used

https://aclanthology.org/


Neural Machine Translation without Embeddings

Association for Computational Linguistics, 2021 (article)

Many NLP models operate over sequences of subword tokens produced by hand-crafted tokenization rules and heuristic subword induction algorithms. A simple universal alternative is to represent every computerized text as a sequence of bytes via UTF-8, obviating the need for an embedding layer since there are fewer token types (256) than dimensions. Surprisingly, replacing the ubiquitous embedding layer with one-hot representations of each byte does not hurt performance; experiments on byte-to-byte machine translation from English to 10 different languages show a consistent improvement in BLEU, rivaling character-level and even standard subword-level models. A deeper investigation reveals that the combination of embeddingless models with decoder-input dropout amounts to token dropout, which benefits byte-to-byte models in particular.
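As a rough illustration of the byte-level input representation described above, the following sketch encodes a string as UTF-8 bytes and maps each byte to a 256-dimensional one-hot vector. It is a minimal example under the assumption that plain Python lists stand in for model tensors; the function name `bytes_one_hot` is hypothetical and does not come from the paper.

```python
def bytes_one_hot(text):
    """Encode text as UTF-8 bytes and return one 256-dim one-hot row per byte."""
    byte_ids = list(text.encode("utf-8"))  # each value is in range(256)
    return [[1.0 if j == b else 0.0 for j in range(256)] for b in byte_ids]


vectors = bytes_one_hot("hé")
# "h" is one byte in UTF-8, "é" is two, so the sequence has three rows.
print(len(vectors), len(vectors[0]))
```

Because there are only 256 byte values, no learned vocabulary is needed, and non-ASCII characters simply expand into several byte positions.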


mixSeq: A Simple Data Augmentation Method for Neural Machine Translation

Association for Computational Linguistics, 2021 (article)

Data augmentation, which refers to manipulating the inputs (e.g., adding random noise, masking specific parts) to enlarge the dataset, has been widely adopted in machine learning. Most data augmentation techniques operate on a single input, which limits the diversity of the training corpus. In this paper, we propose a simple yet effective data augmentation technique for neural machine translation, mixSeq, which operates on multiple inputs and their corresponding targets. Specifically, we randomly select two input sequences, concatenate them together as a longer input as well as their corresponding target sequences as an enlarged target, and train models on the augmented dataset. Experiments on nine machine translation tasks demonstrate that such a simple method boosts the baselines by a non-trivial margin. Our method can be further combined with single-input-based data augmentation methods to obtain further improvements.
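The concatenation step described in this abstract can be sketched as follows. This is an illustrative example, not the authors' code: the separator token `<sep>`, the choice to sample two distinct pairs, and the function name `mixseq_augment` are all assumptions made for the sketch.

```python
import random


def mixseq_augment(pairs, num_new, rng, sep="<sep>"):
    """Build num_new examples by concatenating two randomly chosen pairs.

    pairs: list of (source, target) strings.
    sep:   hypothetical separator token placed between the two sentences.
    """
    augmented = []
    for _ in range(num_new):
        # Pick two distinct parallel pairs (an assumption of this sketch).
        (src_a, tgt_a), (src_b, tgt_b) = rng.sample(pairs, 2)
        # Concatenate sources and targets in the same order so they stay aligned.
        augmented.append((f"{src_a} {sep} {src_b}", f"{tgt_a} {sep} {tgt_b}"))
    return augmented


rng = random.Random(0)
corpus = [("good morning", "guten Morgen"), ("thank you", "danke"), ("yes", "ja")]
for src, tgt in mixseq_augment(corpus, 2, rng):
    print(src, "=>", tgt)
```

The augmented pairs would then be added to the original corpus before training; keeping the same order on both sides is what preserves the source-target correspondence.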


Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation

Association for Computational Linguistics, 2021 (article)

Successful methods for unsupervised neural machine translation (UNMT) employ cross-lingual pretraining via self-supervision, often in the form of a masked language modeling or a sequence generation task, which requires the model to align the lexical- and high-level representations of the two languages. While cross-lingual pretraining works for similar languages with abundant corpora, it performs poorly in low-resource and distant languages. Previous research has shown that this is because the representations are not sufficiently aligned. In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings. Empirical results demonstrate improved performance both on UNMT (up to 4.5 BLEU) and bilingual lexicon induction using our method compared to a UNMT baseline.


Sentence Concatenation Approach to Data Augmentation for Neural Machine Translation

Association for Computational Linguistics, 2021 (article)

Neural machine translation has recently become widely used for its high translation accuracy, but it is also known to perform poorly on long sentences. This tendency is especially prominent for low-resource languages. We assume that these problems are caused by the scarcity of long sentences in the training data. Therefore, we propose a data augmentation method for handling long sentences. Our method is simple: we use only the given parallel corpora as training data and generate long sentences by concatenating two sentences. Our experiments confirm that the proposed data augmentation improves long-sentence translation despite its simplicity. Moreover, the proposed method improves translation quality further when combined with back-translation.


Data and Parameter Scaling Laws for Neural Machine Translation

Association for Computational Linguistics, 2021 (article)

We observe that the development cross-entropy loss of supervised neural machine translation models scales like a power law with the amount of training data and the number of non-embedding parameters in the model. We discuss some practical implications of these results, such as predicting BLEU achieved by large scale models and predicting the ROI of labeling data in low-resource language pairs.
