Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation

تحسين القدرة المعجمية من نماذج اللغة المحددة مسبقا للترجمة الآلية العصبية غير المعينة

504 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

الفلتره unsupervised neural machine ability of pretrained الآلة العصبية غير المنشأة القدرة على الاحترام صناعة حمض الفوسفور

visit our facebook page

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Successful methods for unsupervised neural machine translation (UNMT) employ cross-lingual pretraining via self-supervision, often in the form of a masked language modeling or a sequence generation task, which requires the model to align the lexical- and high-level representations of the two languages. While cross-lingual pretraining works for similar languages with abundant corpora, it performs poorly in low-resource and distant languages. Previous research has shown that this is because the representations are not sufficiently aligned. In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings. Empirical results demonstrate improved performance both on UNMT (up to 4.5 BLEU) and bilingual lexicon induction using our method compared to a UNMT baseline.

References used

https://aclanthology.org/

rate research

Unsupervised Paraphrasing with Pretrained Language Models

471 - Association for Computation Linguistics 2021 مقالة

Paraphrase generation has benefited extensively from recent progress in the designing of training objectives and model architectures. However, previous explorations have largely focused on supervised methods, which require a large amount of labeled d ata that is costly to collect. To address this drawback, we adopt a transfer learning approach and propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting. Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking (DB). To enforce a surface form dissimilar from the input, whenever the language model emits a token contained in the source sequence, DB prevents the model from outputting the subsequent source token for the next generation step. We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair (QQP) and the ParaNMT datasets and is robust to domain shift between the two datasets of distinct distributions. We also demonstrate that our model transfers to paraphrasing in other languages without any additional finetuning.

مكافأة تقليد مختلفة paraphrasing with pretrained إعادة صياغة مع الاحاد صناعة حمض الفوسفور

Measuring and Improving Consistency in Pretrained Language Models

445 - Association for Computation Linguistics 2021 مقالة

Abstract Consistency of a model---that is, the invariance of its behavior under meaning-preserving alternations in its input---is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Language Models (PLMs) consistent with respect to factual knowledge? To this end, we create ParaRel?, a high-quality resource of cloze-style query English paraphrases. It contains a total of 328 paraphrases for 38 relations. Using ParaRel?, we show that the consistency of all PLMs we experiment with is poor--- though with high variance between relations. Our analysis of the representational spaces of PLMs suggests that they have a poor structure and are currently not suitable for representing knowledge robustly. Finally, we propose a method for improving model consistency and experimentally demonstrate its effectiveness.1

pretrained language models pretrained language نماذج اللغة المحددة مسبقا اللغة المحددة صناعة حمض الفوسفور

The Highs and Lows of Simple Lexical Domain Adaptation Approaches for Neural Machine Translation

383 - Association for Computation Linguistics 2021 مقالة

Machine translation systems are vulnerable to domain mismatch, especially in a low-resource scenario. Out-of-domain translations are often of poor quality and prone to hallucinations, due to exposure bias and the decoder acting as a language model. W e adopt two approaches to alleviate this problem: lexical shortlisting restricted by IBM statistical alignments, and hypothesis reranking based on similarity. The methods are computationally cheap and show success on low-resource out-of-domain test sets. However, the methods lose advantage when there is sufficient data or too great domain mismatch. This is due to both the IBM model losing its advantage over the implicitly learned neural alignment, and issues with subword segmentation of unseen words.

highs and lows lows of simple simple lexical domain مرتفعات ومنخفظات أدنى مستوى بسيط المجال المعجمي بسيط صناعة حمض الفوسفور المزيد..

Discourse Probing of Pretrained Language Models

448 - Association for Computation Linguistics 2021 مقالة

Existing work on probing of pretrained language models (LMs) has predominantly focused on sentence-level syntactic tasks. In this paper, we introduce document-level discourse probing to evaluate the ability of pretrained LMs to capture document-level relations. We experiment with 7 pretrained LMs, 4 languages, and 7 discourse probing tasks, and find BART to be overall the best model at capturing discourse --- but only in its encoder, with BERT performing surprisingly well as the baseline model. Across the different models, there are substantial differences in which layers best capture discourse information, and large disparities between models.

تثق صناعة حمض الفوسفور

Encouraging Lexical Translation Consistency for Document-Level Neural Machine Translation

420 - Association for Computation Linguistics 2021 مقالة

Recently a number of approaches have been proposed to improve translation performance for document-level neural machine translation (NMT). However, few are focusing on the subject of lexical translation consistency. In this paper we apply one transla tion per discourse'' in NMT, and aim to encourage lexical translation consistency for document-level NMT. This is done by first obtaining a word link for each source word in a document, which tells the positions where the source word appears. Then we encourage the translation of those words within a link to be consistent in two ways. On the one hand, when encoding sentences within a document we properly share context information of those words. On the other hand, we propose an auxiliary loss function to better constrain that their translation should be consistent. Experimental results on Chinese↔English and English→French translation tasks show that our approach not only achieves state-of-the-art performance in BLEU scores, but also greatly improves lexical consistency in translation.

متعدد رئيس الانتباه مؤخرا صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation

تحسين القدرة المعجمية من نماذج اللغة المحددة مسبقا للترجمة الآلية العصبية غير المعينة

Ask ChatGPT about the research

Read More

suggested questions