Dealing with the Paradox of Quality Estimation

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Keywords: glossary, pseudo-QE dataset, phosphoric acid production

In quality estimation (QE), the quality of a translation is predicted from the source sentence and the machine translation (MT) output, without access to a reference sentence. However, there is a paradox: constructing a dataset for training a QE model requires non-trivial human labor and time, and it may even require more effort than constructing a parallel corpus. In this study, to address this paradox and make the various applications of QE available even in low-resource languages (LRLs), we propose a method for automatically constructing a pseudo-QE dataset without human labor. We perform a comparative analysis on the pseudo-QE dataset using multilingual pre-trained language models. Because the dataset is pseudo-generated, we use the outputs of various external machine translators as test sets to verify the accuracy of the results objectively. The experimental results show that multilingual BART achieves the best performance, and we confirm the applicability of QE in LRLs using pseudo-QE dataset construction methods.
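The abstract does not say how the pseudo-QE labels are produced, so the following is only a minimal sketch of one common way to derive QE labels automatically from a parallel corpus, not the authors' method. It assumes each source sentence has a reference translation, and it scores the MT output by a word-level edit distance against that reference, normalized by reference length (a simplified, TER-like score without shifts). The names `word_edit_distance`, `pseudo_qe_label`, `build_pseudo_qe_dataset`, and the `translate` callable are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance between two lists of word tokens."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def pseudo_qe_label(mt_output, reference):
    """Assumed labeling rule: edits per reference word, capped at 1.0.

    0.0 means the MT output equals the reference; higher is worse.
    """
    hyp, ref = mt_output.split(), reference.split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return min(1.0, word_edit_distance(hyp, ref) / len(ref))


def build_pseudo_qe_dataset(parallel_corpus, translate):
    """Turn (source, reference) pairs into (source, mt, score) records.

    `translate` is any callable mapping a source sentence to an MT output;
    it stands in for whatever MT system is available (hypothetical here).
    """
    dataset = []
    for source, reference in parallel_corpus:
        mt_output = translate(source)
        dataset.append({
            "source": source,
            "mt": mt_output,
            "score": pseudo_qe_label(mt_output, reference),
        })
    return dataset
```

A QE model would then be trained to predict `score` from `source` and `mt` alone, so the reference is needed only when building the dataset, not at prediction time.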

References used

https://aclanthology.org/

Quality Estimation Using Dual Encoders with Transfer Learning

1512 - Association for Computational Linguistics 2021, article

This paper describes POSTECH's quality estimation systems submitted to Task 2 of the WMT 2021 quality estimation shared task: Word and Sentence-Level Post-editing Effort. We observe that the stability of recent quality estimation models can be improved. These models use a single self-attention encoder to process both inputs, the source sequence and its machine translation, at once, and so neglect pre-trained monolingual representations, which are generally accepted as reliable for a variety of natural language processing tasks. Our model therefore uses two pre-trained monolingual encoders and exchanges information between the two encoded representations through two additional cross-attention networks. According to the official leaderboard, our systems outperform the baseline systems in terms of the Matthews correlation coefficient for word-level quality estimation of machine translations and in terms of Pearson's correlation coefficient for sentence-level quality estimation, by 0.4126 and 0.5497 respectively.
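For readers unfamiliar with the two metrics named above, here is a minimal sketch of how they are computed, using the standard textbook formulas in plain Python. It is not the shared task's official scoring script. The function names, and the convention that word-level tags are encoded as 1 (OK) and 0 (BAD), are assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between predicted and gold sentence-level scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom else 0.0


def matthews_corrcoef(gold, pred):
    """Matthews correlation coefficient for binary word-level tags (1 or 0)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Both metrics range from -1 to 1, with 0 indicating no correlation with the gold labels. Unlike accuracy, the Matthews coefficient is not inflated when one tag dominates the data.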

Keywords: fine polishing team, POSTECH, quality estimation, post quality estimation

Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction

729 - Association for Computational Linguistics 2021, article

Grammatical Error Correction (GEC) aims to correct writing errors and help language learners improve their writing skills. However, existing GEC models tend to produce spurious corrections or fail to detect many errors. A quality estimation model is therefore necessary to ensure that learners get accurate GEC results and are not misled by poorly corrected sentences. Well-trained GEC models can generate several high-quality hypotheses through decoding, such as beam search, which provide valuable GEC evidence and can be used to evaluate GEC quality. However, existing models neglect the possible GEC evidence from different hypotheses. This paper presents the Neural Verification Network (VERNet) for GEC quality estimation with multiple hypotheses. VERNet establishes interactions among hypotheses with a reasoning graph and uses two kinds of attention mechanisms to propagate GEC evidence and verify the quality of generated hypotheses. Our experiments on four GEC datasets show that VERNet achieves state-of-the-art grammatical error detection performance, achieves the best quality estimation results, and significantly improves GEC performance by reranking hypotheses. All data and source code are available at https://github.com/thunlp/VERNet.

Keywords: text matching model, GEC

Findings of the WMT 2021 Shared Task on Quality Estimation

980 - Association for Computational Linguistics 2021, article

We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels. This edition focused on two main novel additions: (i) prediction for unseen languages, i.e. zero-shot settings, and (ii) prediction of sentences with catastrophic errors. In addition, new data was released for a number of languages, especially post-edited data. Participating teams from 19 institutions submitted a total of 1263 systems to different task variants and language pairs.

Keywords: WMT, biomedical task, task on quality

The approach of the ancients in dealing with dialects and the position of modernists

3350 - Tishreen University 2018, research paper

The subject of dialects has been a source of confusion in Arabic grammar, because the grammarians used both the terms "dialect" and "language" to describe the dialectal differences between the tribes. The modernists, moreover, did not produce independent works devoted to studying each dialect separately. They identified the eloquent tribes whose language could be adopted, and left the other tribes aside on the pretext that they fell short of the linguistic standard. This research therefore focuses on two issues. The first is how the ancients dealt with dialects, relying on Al-Khasa'is by Ibn Jinni, Sibawayh's Al-Kitab, Al-Sahibi fi Fiqh al-Lugha by Ibn Faris, Al-Muqtadab by al-Mubarrad, and other books on this subject. The second is the modern position and its most prominent criticisms of the approach of the ancients.

Keywords: language, dialect, ancients, modernists

Self-Supervised Quality Estimation for Machine Translation

1122 - Association for Computational Linguistics 2021, article

Quality estimation (QE) of machine translation (MT) aims to evaluate the quality of machine-translated sentences without references and is important in practical applications of MT. Training QE models requires massive parallel data with hand-crafted quality annotations, which are time-consuming and labor-intensive to obtain. To address the absence of annotated training data, previous studies have attempted to develop unsupervised QE methods. However, very few of them can be applied to both sentence- and word-level QE tasks, and they may suffer from noise in the synthetic data. To reduce the negative impact of this noise, we propose a self-supervised method for both sentence- and word-level QE, which performs quality estimation by recovering masked target words. Experimental results show that our method outperforms previous unsupervised methods on several QE tasks in different language pairs and domains.

Keywords: embedding position, self-supervised quality estimation, supervised quality estimation
