An Exploratory Analysis of Multilingual Word-Level Quality Estimation with Cross-Lingual Transformers

85 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Tharindu Ranasinghe Mr

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Tharindu Ranasinghe - Constantin Orasan - Ruslan Mitkov

الحساب واللغة الذكاء الاصطناعي التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Most studies on word-level Quality Estimation (QE) of machine translation focus on language-specific models. The obvious disadvantages of these approaches are the need for labelled data for each language pair and the high cost required to maintain several language-specific models. To overcome these problems, we explore different approaches to multilingual, word-level QE. We show that these QE models perform on par with the current language-specific models. In the cases of zero-shot and few-shot QE, we demonstrate that it is possible to accurately predict word-level quality for any given new language pair from models trained on other language pairs. Our findings suggest that the word-level QE models based on powerful pre-trained transformers that we propose in this paper generalise well across languages, making them more useful in real-world scenarios.

قيم البحث

72 - Hansi Hettiarachchi , Tharindu Ranasinghe 2021

Identifying whether a word carries the same meaning or different meaning in two contexts is an important research area in natural language processing which plays a significant role in many applications such as question answering, document summarisati on, information retrieval and information extraction. Most of the previous work in this area rely on language-specific resources making it difficult to generalise across languages. Considering this limitation, our approach to SemEval-2021 Task 2 is based only on pretrained transformer models and does not use any language-specific processing and resources. Despite that, our best model achieves 0.90 accuracy for English-English subtask which is very compatible compared to the best result of the subtask; 0.93 accuracy. Our approach also achieves satisfactory results in other monolingual and cross-lingual language pairs as well.

الحساب واللغة الذكاء الاصطناعي التعلم الآلي

Levenshtein Training for Word-level Quality Estimation

170 - Shuoyang Ding , Marcin Junczys-Dowmunt , Matt Post 2021

We propose a novel scheme to use the Levenshtein Transformer to perform the task of word-level quality estimation. A Levenshtein Transformer is a natural fit for this task: trained to perform decoding in an iterative manner, a Levenshtein Transformer can learn to post-edit without explicit supervision. To further minimize the mismatch between the translation task and the word-level QE task, we propose a two-stage transfer learning procedure on both augmented data and human post-editing data. We also propose heuristics to construct reference labels that are compatible with subword-level finetuning and inference. Results on WMT 2020 QE shared task dataset show that our proposed method has superior data efficiency under the data-constrained setting and competitive performance under the unconstrained setting.

الحساب واللغة

Zero-Shot Translation Quality Estimation with Explicit Cross-Lingual Patterns

70 - Lei Zhou , Liang Ding , Koichi Takeda 2020

This paper describes our submission of the WMT 2020 Shared Task on Sentence Level Direct Assessment, Quality Estimation (QE). In this study, we empirically reveal the textit{mismatching issue} when directly adopting BERTScore to QE. Specifically, the re exist lots of mismatching errors between the source sentence and translated candidate sentence with token pairwise similarity. In response to this issue, we propose to expose explicit cross-lingual patterns, textit{e.g.} word alignments and generation score, to our proposed zero-shot models. Experiments show that our proposed QE model with explicit cross-lingual patterns could alleviate the mismatching issue, thereby improving the performance. Encouragingly, our zero-shot QE method could achieve comparable performance with supervised QE method, and even outperforms the supervised counterpart on 2 out of 6 directions. We expect our work could shed light on the zero-shot QE model improvement.

الحساب واللغة

Massively Multilingual Document Alignment with Cross-lingual Sentence-Movers Distance

303 - Ahmed El-Kishky , Francisco Guzman 2020

Document alignment aims to identify pairs of documents in two distinct languages that are of comparable content or translations of each other. Such aligned data can be used for a variety of NLP tasks from training cross-lingual representations to min ing parallel data for machine translation. In this paper we develop an unsupervised scoring function that leverages cross-lingual sentence embeddings to compute the semantic distance between documents in different languages. These semantic distances are then used to guide a document alignment algorithm to properly pair cross-lingual web documents across a variety of low, mid, and high-resource language pairs. Recognizing that our proposed scoring function and other state of the art methods are computationally intractable for long web documents, we utilize a more tractable greedy algorithm that performs comparably. We experimentally demonstrate that our distance metric performs better alignment than current baselines outperforming them by 7% on high-resource language pairs, 15% on mid-resource language pairs, and 22% on low-resource language pairs.

الحساب واللغة استرجاع المعلومات التعلم الآلي

From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers

86 - Anne Lauscher , Vinit Ravishankar , Ivan Vulic 2020

Massively multilingual transformers pretrained with language modeling objectives (e.g., mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross-lingual transfer in NLP, offering unmatched transfer performance. Current downs tream evaluations, however, verify their efficacy predominantly in transfer settings involving languages with sufficient amounts of pretraining data, and with lexically and typologically close languages. In this work, we analyze their limitations and show that cross-lingual transfer via massively multilingual transformers, much like transfer via cross-lingual word embeddings, is substantially less effective in resource-lean scenarios and for distant languages. Our experiments, encompassing three lower-level tasks (POS tagging, dependency parsing, NER), as well as two high-level semantic tasks (NLI, QA), empirically correlate transfer performance with linguistic similarity between the source and target languages, but also with the size of pretraining corpora of target languages. We also demonstrate a surprising effectiveness of inexpensive few-shot transfer (i.e., fine-tuning on a few target-language instances after fine-tuning in the source) across the board. This suggests that additional research efforts should be invested to reach beyond the limiting zero-shot conditions.

الحساب واللغة