Tag Assisted Neural Machine Translation of Film Subtitles

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

We implemented a neural machine translation system that uses automatic sequence tagging to improve the quality of translation. Instead of operating on unannotated sentence pairs, our system uses pre-trained tagging systems to add linguistic features to source and target sentences. Our proposed neural architecture learns a combined embedding of tokens and tags in the encoder, and simultaneous token and tag prediction in the decoder. Compared to a baseline with unannotated training, this architecture increased the BLEU score of German to English film subtitle translation outputs by 1.61 points using named entity tags; however, the BLEU score decreased by 0.38 points using part-of-speech tags. This demonstrates that certain token-level tag outputs from off-the-shelf tagging systems can improve the output of neural translation systems using our combined embedding and simultaneous decoding extensions.
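The combined embedding and the simultaneous prediction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how token and tag embeddings are combined, so concatenation is assumed here, and the function names (`combined_embedding`, `predict_token_and_tag`), the table sizes, and the two linear output heads are all hypothetical.

```python
import random


def make_table(n_rows, dim, rng):
    """Build a small random embedding table (one row per vocabulary entry)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_rows)]


def combined_embedding(token_ids, tag_ids, token_table, tag_table):
    """Encoder side: one vector per position, built from a token and its tag.

    Assumption: the two embeddings are concatenated. Summation or a learned
    projection would be equally plausible readings of "combined embedding".
    """
    if len(token_ids) != len(tag_ids):
        raise ValueError("each token needs exactly one tag")
    return [token_table[t] + tag_table[g] for t, g in zip(token_ids, tag_ids)]


def linear(vec, weights):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def argmax(scores):
    """Index of the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def predict_token_and_tag(hidden, token_head, tag_head):
    """Decoder side: two output heads read the same hidden state,
    so a token and a tag are predicted at the same step."""
    return argmax(linear(hidden, token_head)), argmax(linear(hidden, tag_head))


rng = random.Random(0)
token_table = make_table(n_rows=5, dim=4, rng=rng)  # hypothetical vocabulary of 5 tokens
tag_table = make_table(n_rows=3, dim=2, rng=rng)    # hypothetical set of 3 tags
vectors = combined_embedding([1, 2, 4], [0, 2, 1], token_table, tag_table)
```

With concatenation, each position's vector has the token dimension plus the tag dimension (here 4 + 2 = 6), so the encoder input size grows with the tag embedding size.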

References used

https://aclanthology.org/

Sampling and Filtering of Neural Machine Translation Distillation Data

Association for Computational Linguistics, 2021 (article)

In most neural machine translation distillation or stealing scenarios, the highest-scoring hypothesis of the target model (teacher) is used to train a new model (student). If reference translations are also available, then better hypotheses (with respect to the references) can be oversampled and poor hypotheses either removed or undersampled. This paper explores the sampling method landscape (pruning, hypothesis oversampling and undersampling, deduplication, and their combination) with English to Czech and English to German MT models using standard MT evaluation metrics. We show that careful oversampling and combination with the original data leads to better performance when compared to training only on the original or synthesized data or their direct combination.
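The pruning, oversampling, and deduplication steps named above could look roughly like this. It is a sketch under stated assumptions rather than the paper's procedure: the scores are taken to be some sentence-level metric computed against the reference, and the function name `resample_hypotheses`, both thresholds, and the copy count are hypothetical values chosen only for illustration.

```python
def resample_hypotheses(scored, prune_below=0.2, oversample_above=0.6, copies=2):
    """Turn (hypothesis, score) pairs from the teacher into student training data.

    - deduplication: an identical hypothesis is only considered once
    - pruning: hypotheses scoring below `prune_below` are dropped
    - oversampling: hypotheses scoring at least `oversample_above` are repeated
    All thresholds are illustrative assumptions, not values from the paper.
    """
    seen = set()
    selected = []
    for hypothesis, score in scored:
        if hypothesis in seen:
            continue  # deduplicate
        seen.add(hypothesis)
        if score < prune_below:
            continue  # prune poor hypotheses
        repeat = copies if score >= oversample_above else 1
        selected.extend([hypothesis] * repeat)
    return selected


teacher_output = [
    ("good translation", 0.9),
    ("average translation", 0.5),
    ("poor translation", 0.1),
    ("good translation", 0.9),  # duplicate of the first entry
]
student_data = resample_hypotheses(teacher_output)
```

Combining with the original data, as the abstract recommends, would then amount to adding the original parallel sentences to `student_data` before training.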

Machine-Assisted Script Curation

Association for Computational Linguistics, 2021 (article)

We describe Machine-Aided Script Curator (MASC), a system for human-machine collaborative script authoring. Scripts produced with MASC include (1) English descriptions of sub-events that comprise a larger, complex event; (2) event types for each of those events; (3) a record of entities expected to participate in multiple sub-events; and (4) temporal sequencing between the sub-events. MASC automates portions of the script creation process with suggestions for event types, links to Wikidata, and sub-events that may have been forgotten. We illustrate how these automations are useful to the script writer with a few case-study scripts.

GTCOM Neural Machine Translation Systems for WMT21

Association for Computational Linguistics, 2021 (article)

This paper describes Global Tone Communication Co., Ltd.'s submission to the WMT21 shared news translation task. We participate in six directions: English to/from Hausa, Hindi to/from Bengali, and Zulu to/from Xhosa. Our submitted systems are unconstrained and focus on multilingual translation models, back-translation, and forward-translation. We also apply rules and a language model to filter monolingual sentences, parallel sentences, and synthetic sentences.

Machine Translation Believability

Association for Computational Linguistics, 2021 (article)

Successful Machine Translation (MT) deployment requires understanding not only the intrinsic qualities of MT output, such as fluency and adequacy, but also user perceptions. Users who do not understand the source language respond to MT output based on their perception of the likelihood that the meaning of the MT output matches the meaning of the source text. We refer to this as believability. Output that is not believable may be off-putting to users, but believable MT output with incorrect meaning may mislead them. In this work, we study the relationship of believability to fluency and adequacy by applying traditional MT direct assessment protocols to annotate all three features on the output of neural MT systems. Quantitative analysis of these annotations shows that believability is closely related to but distinct from fluency, and initial qualitative analysis suggests that semantic features may account for the difference.

Distributionally Robust Multilingual Machine Translation

Association for Computational Linguistics, 2021 (article)

Multilingual neural machine translation (MNMT) learns to translate multiple language pairs with a single model, potentially improving both the accuracy and the memory-efficiency of deployed models. However, the heavy data imbalance between languages hinders the model from performing uniformly across language pairs. In this paper, we propose a new learning objective for MNMT based on distributionally robust optimization, which minimizes the worst-case expected loss over the set of language pairs. We further show how to practically optimize this objective for large translation corpora using an iterated best response scheme, which is both effective and incurs negligible additional computational cost compared to standard empirical risk minimization. We perform extensive experiments on three sets of languages from two datasets and show that our method consistently outperforms strong baseline methods in terms of average and per-language performance under both many-to-one and one-to-many translation settings.
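One common way to write a worst-case objective of this kind is shown below. The notation is an assumption made for illustration and is not taken from the paper: $\theta$ denotes the model parameters, $D_i$ the training distribution of language pair $i$ out of $N$, $\ell$ the per-example translation loss, and $\mathcal{Q}$ an uncertainty set of weightings $q$ over language pairs.

```latex
\min_{\theta} \; \max_{q \in \mathcal{Q}} \; \sum_{i=1}^{N} q_i \,
  \mathbb{E}_{(x, y) \sim D_i} \left[ \ell(x, y; \theta) \right]
```

Standard empirical risk minimization corresponds to fixing $q$ to the data proportions; letting an adversary pick $q$ within $\mathcal{Q}$ shifts weight toward the language pairs the model currently handles worst.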
