Towards Document-Level Human MT Evaluation: On the Issues of Annotator Agreement, Effort and Misevaluation

Added by Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English

Keywords: document-level human evaluation, issues of annotator, annotator agreement

Abstract

Document-level human evaluation of machine translation (MT) has attracted growing interest in the community. However, little is known about the issues of using document-level methodologies to assess MT quality. In this article, we compare the inter-annotator agreement (IAA) scores, the effort required to assess quality under different document-level methodologies, and the issue of misevaluation when sentences are evaluated out of context.

References used

https://aclanthology.org/

On User Interfaces for Large-Scale Document-Level Human Evaluation of Machine Translation Outputs

Association for Computational Linguistics, 2021 (article)

Recent studies emphasize the need for document context in human evaluation of machine translations, but little research has been done on the impact of user interfaces on annotator productivity and the reliability of assessments. In this work, we compare human assessment data from the last two WMT evaluation campaigns collected via two different methods for document-level evaluation. Our analysis shows that a document-centric approach to evaluation, where the annotator is presented with the entire document context on a screen, leads to higher-quality segment- and document-level assessments. It improves the correlation between segment and document scores and increases inter-annotator agreement for document scores, but is considerably more time-consuming for annotators.

Keywords: machine translation outputs, translation outputs, user interfaces

DELA Corpus - A Document-Level Corpus Annotated with Context-Related Issues

Association for Computational Linguistics, 2021 (article)

Recently, the Machine Translation (MT) community has become more interested in document-level evaluation, especially in light of reactions to claims of "human parity", since examining quality at the level of the document rather than at the sentence level allows for the assessment of suprasentential context, providing a more reliable evaluation. This paper presents a document-level corpus annotated in English with context-aware issues that arise when translating from English into Brazilian Portuguese, namely ellipsis, gender, lexical ambiguity, number, reference, and terminology, with six different domains. The corpus can be used as a challenge test set for evaluation and as a training/testing corpus for MT as well as for deep linguistic analysis of context issues. To the best of our knowledge, this is the first corpus of its kind.

Keywords: document-level corpus annotated, DELA corpus, corpus annotated

Entity and Evidence Guided Document-Level Relation Extraction

Association for Computational Linguistics, 2021 (article)

Document-level relation extraction is a challenging task, requiring reasoning over multiple sentences to predict a set of relations in a document. In this paper, we propose a novel framework, E2GRE (Entity and Evidence Guided Relation Extraction), that jointly extracts relations and the underlying evidence sentences by using a large pretrained language model (LM) as the input encoder. First, we propose to guide the pretrained LM's attention mechanism to focus on relevant context by using attention probabilities as additional features for evidence prediction. Furthermore, instead of feeding the whole document into pretrained LMs to obtain entity representations, we concatenate document text with head entities to help LMs concentrate on parts of the document that are more related to the head entity. Our E2GRE jointly learns relation extraction and evidence prediction effectively, showing large gains on both these tasks, which we find are highly correlated.

Keywords: document-level relation extraction, guided relation extraction, relation extraction

Why Do Document-Level Polarity Classifiers Fail?

Association for Computational Linguistics, 2021 (article)

Machine learning solutions are often criticized for the lack of explanation of their successes and failures. Understanding which instances are misclassified and why is essential to improve the learning process. This work helps to fill this gap by proposing a methodology to characterize, quantify and measure the impact of hard instances in the task of polarity classification of movie reviews. We characterize such instances into two categories: neutrality, where the text does not convey a clear polarity, and discrepancy, where the polarity of the text is the opposite of its true rating. We quantify the number of hard instances in polarity classification of movie reviews and provide empirical evidence about the need to pay attention to such problematic instances, as they are much harder to classify, for both machine and human classifiers. To the best of our knowledge, this is the first systematic analysis of the impact of hard instances in polarity detection from well-formed textual reviews.

Keywords: polarity classifiers fail, classifiers fail, document-level polarity classifiers

Document-Level Text Simplification: Dataset, Criteria and Baseline

Association for Computational Linguistics, 2021 (article)

Text simplification is a valuable technique. However, current research is limited to sentence simplification. In this paper, we define and investigate a new task of document-level text simplification, which aims to simplify a document consisting of multiple sentences. Based on Wikipedia dumps, we first construct a large-scale dataset named D-Wikipedia and perform analysis and human evaluation on it to show that the dataset is reliable. Then, we propose a new automatic evaluation metric called D-SARI that is more suitable for the document-level simplification task. Finally, we select several representative models as baseline models for this task and perform automatic evaluation and human evaluation. We analyze the results and point out the shortcomings of the baseline models.

Keywords: document-level text
