Visual Cues and Error Correction for Translation Robustness

Added by Association for Computational Linguistics (article)

Publication date 2021

Field Artificial Intelligence

Research language English

Created by Shamra Editor

Abstract

Neural Machine Translation models are sensitive to noise in the input texts, such as misspelled words and ungrammatical constructions. Existing robustness techniques generally fail when faced with unseen types of noise and their performance degrades on clean texts. In this paper, we focus on three types of realistic noise that are commonly generated by humans and introduce the idea of visual context to improve translation robustness for noisy texts. In addition, we describe a novel error correction training regime that can be used as an auxiliary task to further improve translation robustness. Experiments on English-French and English-German translation show that both multimodal and error correction components improve model robustness to noisy texts, while still retaining translation quality on clean texts.

References used

https://aclanthology.org/

Hierarchical Character Tagger for Short Text Spelling Error Correction

Association for Computational Linguistics, 2021 (article)

State-of-the-art approaches to the spelling error correction problem include Transformer-based Seq2Seq models, which require large training sets and suffer from slow inference time; and sequence labeling models based on Transformer encoders like BERT, which involve token-level label space and therefore a large pre-defined vocabulary dictionary. In this paper we present a Hierarchical Character Tagger model, or HCTagger, for short text spelling error correction. We use a pre-trained language model at the character level as a text encoder, and then predict character-level edits to transform the original text into its error-free form with a much smaller label space. For decoding, we propose a hierarchical multi-task approach to alleviate the issue of long-tail label distribution without introducing extra model parameters. Experiments on two public misspelling correction datasets demonstrate that HCTagger is an accurate and much faster approach than many existing models.

spelling error correction, text spelling error, hierarchical character tagger

Comparison of Grammatical Error Correction Using Back-Translation Models

Association for Computational Linguistics, 2021 (article)

Grammatical error correction (GEC) suffers from a lack of sufficient parallel data. Studies on GEC have proposed several methods to generate pseudo data, which comprise pairs of grammatical and artificially produced ungrammatical sentences. Currently, a mainstream approach to generate pseudo data is back-translation (BT). Most previous studies using BT have employed the same architecture for both the GEC and BT models. However, GEC models have different correction tendencies depending on the architecture of their models. Thus, in this study, we compare the correction tendencies of GEC models trained on pseudo data generated by three BT models with different architectures, namely, Transformer, CNN, and LSTM. The results confirm that the correction tendencies for each error type are different for every BT model. In addition, we investigate the correction tendencies when using a combination of pseudo data generated by different BT models. As a result, we find that the combination of different BT models improves or interpolates the performance of each error type compared with using a single BT model with different seeds.

grammatical error correction, grammatical error

Personality Predictive Lexical Cues and Their Correlations

Association for Computational Linguistics, 2021 (article)

In recent years, a number of studies have used linear models for personality prediction based on text. In this paper, we empirically analyze and compare the lexical signals captured in such models. We identify lexical cues for each dimension of the MBTI personality scheme in several different ways, considering different datasets, feature sets, and learning algorithms. We conduct a series of correlation analyses between the resulting MBTI data and explore their connection to other signals, such as for Big-5 traits, emotion, sentiment, age, and gender. The analysis shows intriguing correlation patterns between different personality dimensions and other traits, and also provides evidence for the robustness of the data.

predictive lexical cues, personality predictive lexical, predictive lexical

Grammatical Error Correction with Contrastive Learning in Low Error Density Domains

Association for Computational Linguistics, 2021 (article)

Although grammatical error correction (GEC) has achieved good performance on texts written by learners of English as a second language, performance on low error density domains where texts are written by English speakers of varying levels of proficiency can still be improved. In this paper, we propose a contrastive learning approach to encourage the GEC model to assign a higher probability to a correct sentence while reducing the probability of incorrect sentences that the model tends to generate, so as to improve the accuracy of the model. Experimental results show that our approach significantly improves the performance of GEC models in low error density domains, when evaluated on the benchmark CWEB dataset.

low error density, error density domains

Is This Translation Error Critical?: Classification-Based Human and Automatic Machine Translation Evaluation Focusing on Critical Errors

Association for Computational Linguistics, 2021 (article)

This paper discusses a classification-based approach to machine translation evaluation, as opposed to a common regression-based approach in the WMT Metrics task. Recent machine translation usually works well but sometimes makes critical errors due to just a few wrong word choices. Our classification-based approach focuses on such errors using several error type labels, for practical machine translation evaluation in an age of neural machine translation. We made additional annotations on the WMT 2015-2017 Metrics datasets with fluency and adequacy labels to distinguish different types of translation errors from syntactic and semantic viewpoints. We present our human evaluation criteria for the corpus development and automatic evaluation experiments using the corpus. The human evaluation corpus will be publicly available upon publication.

translation evaluation focusing
