Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Machine Translation in the Covid domain: an English-Irish case study for LoResMT 2021

ترجمة آلية في مجال Covid: دراسة حالة باللغة الإنجليزية الأيرلندية ل Loresmt 2021

1154 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

اليقظ قليلا ضبط english to irish covid الإنجليزية إلى الأيرلندية مرض فيروس كورونا صناعة حمض الفوسفور

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Translation models for the specific domain of translating Covid data from English to Irish were developed for the LoResMT 2021 shared task. Domain adaptation techniques, using a Covid-adapted generic 55k corpus from the Directorate General of Translation, were applied. Fine-tuning, mixed fine-tuning and combined dataset approaches were compared with models trained on an extended in-domain dataset. As part of this study, an English-Irish dataset of Covid related data, from the Health and Education domains, was developed. The highestperforming model used a Transformer architecture trained with an extended in-domain Covid dataset. In the context of this study, we have demonstrated that extending an 8k in-domain baseline dataset by just 5k lines improved the BLEU score by 27 points.

References used

https://aclanthology.org/

rate research

English-Marathi Neural Machine Translation for LoResMT 2021

1055 - Association for Computation Linguistics 2021 مقالة

In this paper, we (team - oneNLP-IIITH) describe our Neural Machine Translation approaches for English-Marathi (both direction) for LoResMT-20211 . We experimented with transformer based Neural Machine Translation and explored the use of different li nguistic features like POS and Morph on subword unit for both English-Marathi and Marathi-English. In addition, we have also explored forward and backward translation using web-crawled monolingual data. We obtained 22.2 (overall 2 nd) and 31.3 (overall 1 st) BLEU scores for English-Marathi and Marathi-English on respectively

مرض فيروس كورونا english-marathi neural machine الإنجليزية-الماراثي الآلة العصبية صناعة حمض الفوسفور

Low Resource Multimodal Neural Machine Translation of English-Hindi in News Domain

827 - Association for Computation Linguistics 2021 مقالة

Incorporating multiple input modalities in a machine translation (MT) system is gaining popularity among MT researchers. Unlike the publicly available dataset for Multimodal Machine Translation (MMT) tasks, where the captions are short image descript ions, the news captions provide a more detailed description of the contents of the images. As a result, numerous named entities relating to specific persons, locations, etc., are found. In this paper, we acquire two monolingual news datasets reported in English and Hindi paired with the images to generate a synthetic English-Hindi parallel corpus. The parallel corpus is used to train the English-Hindi Neural Machine Translation (NMT) and an English-Hindi MMT system by incorporating the image feature paired with the corresponding parallel corpus. We also conduct a systematic analysis to evaluate the English-Hindi MT systems with 1) more synthetic data and 2) by adding back-translated data. Our finding shows improvement in terms of BLEU scores for both the NMT (+8.05) and MMT (+11.03) systems.

low resource multimodal resource multimodal neural موارد منخفضة متعددة الوسائط الموارد متعددة الوسائط العصبية صناعة حمض الفوسفور

Findings of the LoResMT 2021 Shared Task on COVID and Sign Language for Low-resource Languages

934 - Association for Computation Linguistics 2021 مقالة

We present the findings of the LoResMT 2021 shared task which focuses on machine translation (MT) of COVID-19 data for both low-resource spoken and sign languages. The organization of this task was conducted as part of the fourth workshop on technolo gies for machine translation of low resource languages (LoResMT). Parallel corpora is presented and publicly available which includes the following directions: English↔Irish, English↔Marathi, and Taiwanese Sign language↔Traditional Chinese. Training data consists of 8112, 20933 and 128608 segments, respectively. There are additional monolingual data sets for Marathi and English that consist of 21901 segments. The results presented here are based on entries from a total of eight teams. Three teams submitted systems for English↔Irish while five teams submitted systems for English↔Marathi. Unfortunately, there were no systems submissions for the Taiwanese Sign language↔Traditional Chinese task. Maximum system performance was computed using BLEU and follow as 36.0 for English--Irish, 34.6 for Irish--English, 24.2 for English--Marathi, and 31.3 for Marathi--English.

taiwanese sign language لغة علامة تايوانية صناعة حمض الفوسفور

The UCF Systems for the LoResMT 2021 Machine Translation Shared Task

679 - Association for Computation Linguistics 2021 مقالة

We present the University of Central Florida systems for the LoResMT 2021 Shared Task, participating in the English-Irish and English-Marathi translation pairs. We focused our efforts on constrained track of the task, using transfer learning and subw ord segmentation to enhance our models given small amounts of training data. Our models achieved the highest BLEU scores on the fully constrained tracks of English-Irish, Irish-English, and Marathi-English with scores of 13.5, 21.3, and 17.9 respectively

machine translation shared central florida systems ترجمة آلية تقاسمها نظم فلوريدا المركزية صناعة حمض الفوسفور

A3-108 Machine Translation System for LoResMT Shared Task @MT Summit 2021 Conference

583 - Association for Computation Linguistics 2021 مقالة

In this paper, we describe our submissions for LoResMT Shared Task @MT Summit 2021 Conference. We built statistical translation systems in each direction for English ⇐⇒ Marathi language pair. This paper outlines initial baseline experiments with vari ous tokenization schemes to train models. Using optimal tokenization scheme we create synthetic data and further train augmented dataset to create more statistical models. Also, we reorder English to match Marathi syntax to further train another set of baseline and data augmented models using various tokenization schemes. We report configuration of the submitted systems and results produced by them.

loresmt shared task LORESMT المهمة المشتركة صناعة حمض الفوسفور

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Machine Translation in the Covid domain: an English-Irish case study for LoResMT 2021

ترجمة آلية في مجال Covid: دراسة حالة باللغة الإنجليزية الأيرلندية ل Loresmt 2021

Ask ChatGPT about the research

Read More

suggested questions