
NRC-CNRC Systems for Upper Sorbian-German and Lower Sorbian-German Machine Translation 2021


Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English


lower sorbian-german machine, sorbian-german machine translation, sorbian and german


Abstract

We describe our neural machine translation systems for the 2021 shared task on Unsupervised and Very Low Resource Supervised MT, translating between Upper Sorbian and German (low-resource) and between Lower Sorbian and German (unsupervised). The systems incorporated data filtering, backtranslation, BPE-dropout, ensembling, and transfer learning from high(er)-resource languages. As measured by automatic metrics, our systems showed strong performance, consistently placing first or tied for first across most metrics and translation directions.

References used

https://aclanthology.org/


Linguistic Evaluation for the 2021 State-of-the-art Machine Translation Systems for German to English and English to German

Association for Computational Linguistics, 2021 (article)

We use a semi-automated test suite to provide a fine-grained linguistic evaluation of state-of-the-art machine translation systems. The evaluation includes 18 German to English and 18 English to German systems, submitted to the Translation Shared Task of the 2021 Conference on Machine Translation. Our submission builds on the submissions of previous years by creating and applying a wide-range test suite for English to German as a new language pair. The fine-grained evaluation allows spotting significant differences between systems that cannot be distinguished by the direct assessment of the human evaluation campaign. We find that most of the systems achieve good accuracy on the majority of linguistic phenomena, but there are a few phenomena with lower accuracy, such as idioms, the modal pluperfect, and German resultative predicates. Two systems have significantly better macro-averaged test suite accuracy in each language direction: Online-W and Facebook-AI for German to English, and VolcTrans and Online-W for English to German. The systems show a steady improvement compared to previous years.


The UCF Systems for the LoResMT 2021 Machine Translation Shared Task

Association for Computational Linguistics, 2021 (article)

We present the University of Central Florida systems for the LoResMT 2021 Shared Task, participating in the English-Irish and English-Marathi translation pairs. We focused our efforts on the constrained track of the task, using transfer learning and subword segmentation to enhance our models given small amounts of training data. Our models achieved the highest BLEU scores on the fully constrained tracks of English-Irish, Irish-English, and Marathi-English, with scores of 13.5, 21.3, and 17.9 respectively.

machine translation shared, central florida systems

N-gram and Neural Models for Uralic Language Identification: NRC at VarDial 2021

Association for Computational Linguistics, 2021 (article)

We describe the systems developed by the National Research Council Canada for the Uralic language identification shared task at the 2021 VarDial evaluation campaign. We evaluated two different approaches to this task: a probabilistic classifier exploiting only character 5-grams as features, and a character-based neural network pre-trained through self-supervision, then fine-tuned on the language identification task. The former method turned out to perform better, which casts doubt on the usefulness of deep learning methods for language identification, where they have yet to convincingly and consistently outperform simpler and less costly classification algorithms exploiting n-gram features.

uralic language identification, national research council, research council canada

Exploring German Multi-Level Text Simplification

Association for Computational Linguistics, 2021 (article)

We report on experiments in automatic text simplification (ATS) for German with multiple simplification levels along the Common European Framework of Reference for Languages (CEFR), simplifying standard German into levels A1, A2 and B1. For that purpose, we investigate the use of source labels and pretraining on standard German, allowing us to simplify standard language to a specific CEFR level. We show that these approaches are especially effective in low-resource scenarios, where we are able to outperform a standard transformer baseline. Moreover, we introduce copy labels, which we show can help the model make a distinction between sentences that require further modifications and sentences that can be copied as-is.

exploring german multi-level, german multi-level text, multi-level text simplification

THE IWSLT 2021 BUT SPEECH TRANSLATION SYSTEMS

Association for Computational Linguistics, 2021 (article)

The paper describes BUT's English to German offline speech translation (ST) systems developed for IWSLT2021. They are based on jointly trained Automatic Speech Recognition-Machine Translation models. Their performance is evaluated on the MustC-Common test set. In this work, we study their efficiency from the perspective of having a large amount of separate ASR training data and MT training data, and a smaller amount of speech-translation training data. Large amounts of ASR and MT training data are utilized for pre-training the ASR and MT models. Speech-translation data is used to jointly optimize ASR-MT models by defining an end-to-end differentiable path from speech to translations. For this purpose, we use the internal continuous representations from the ASR-decoder as the input to the MT module. We show that speech translation can be further improved by training the ASR-decoder jointly with the MT-module using a large amount of text-only MT training data. We also show significant improvements by training an ASR module capable of generating punctuated text, rather than leaving the punctuation task to the MT module.

speech translation systems, english to german
