
High Frequent In-domain Words Segmentation and Forward Translation for the WMT21 Biomedical Task



Added by: Association for Computational Linguistics (article)

Publication date: 2021

Field: Artificial Intelligence

Research language: English


Keywords: high frequent in-domain, frequent in-domain words, in-domain words segmentation


Abstract

This paper reports on optimizing the use of out-of-domain data in the biomedical translation task. We first optimized our parallel training dataset using in-domain terminology from BabelNet. Then, to enlarge the training set, we studied the effects of out-of-domain data on biomedical translation, created a mixture of in-domain and out-of-domain training sets, and added more in-domain data using forward translation in the English-Spanish task. Finally, with a simple BPE optimization method, we increased the number of in-domain sub-words in our mixed training set and trained the Transformer model on the generated data. Results show improvements using our proposed method.
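The abstract does not spell out how the BPE step was tuned, so the following is only a rough sketch of one way in-domain words could be favored during segmentation. It assumes the in-domain word counts are up-weighted before the BPE merges are learned. The function names (`mixed_word_freqs`, `learn_bpe`, `segment`), the `in_domain_weight` parameter, and the toy corpus are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def mixed_word_freqs(in_domain_tokens, out_domain_tokens, in_domain_weight=1):
    """Build a word-frequency table for a mixed corpus.

    Assumption (not from the paper): in-domain counts are multiplied by
    `in_domain_weight` so that BPE favors merges inside in-domain words.
    """
    freqs = Counter(out_domain_tokens)
    for token, count in Counter(in_domain_tokens).items():
        freqs[token] += count * in_domain_weight
    return freqs


def _merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` by its concatenation."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` BPE merge operations from a {word: count} table."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        best = max(pair_counts, key=pair_counts.get)  # most frequent adjacent pair
        merges.append(best)
        new_vocab = {}
        for symbols, freq in vocab.items():
            key = tuple(_merge_pair(list(symbols), best))
            new_vocab[key] = new_vocab.get(key, 0) + freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Segment a single word by replaying the learned merges in order."""
    symbols = list(word) + ["</w>"]
    for pair in merges:
        symbols = _merge_pair(symbols, pair)
    return symbols


# Toy data: a rare in-domain term among frequent out-of-domain words.
out_domain = ["the"] * 10 + ["then"] * 10
in_domain = ["nephritis"]

plain = learn_bpe(mixed_word_freqs(in_domain, out_domain), num_merges=3)
boosted = learn_bpe(
    mixed_word_freqs(in_domain, out_domain, in_domain_weight=100), num_merges=3
)

print(segment("nephritis", plain))
print(segment("nephritis", boosted))
```

With the same merge budget, the up-weighted table spends its merges on the in-domain term, so "nephritis" is split into fewer pieces than in the unweighted run. A real system would more likely use an established BPE toolkit. The weighting here is only one illustrative choice.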

References used

https://aclanthology.org/


FJWU Participation for the WMT21 Biomedical Translation Task

Association for Computational Linguistics, 2021 (article)

In this paper we present the FJWU's system submitted to the biomedical shared task at WMT21. We prepared state-of-the-art multilingual neural machine translation systems for three languages (i.e. German, Spanish and French) with English as the target language. Our NMT systems, based on the Transformer architecture, were trained on a combination of in-domain and out-of-domain parallel corpora developed using Information Retrieval (IR) and domain adaptation techniques.

Keywords: FJWU participation, biomedical translation task, biomedical shared task

Tencent AI Lab Machine Translation Systems for the WMT21 Biomedical Translation Task

Association for Computational Linguistics, 2021 (article)

This paper describes the Tencent AI Lab submission to the WMT2021 shared task on biomedical translation in eight language directions, covering both directions of English-German, English-French, English-Spanish and English-Russian. We utilized different Transformer architectures, pretraining and back-translation strategies to improve translation quality. Concretely, we explore mBART (Liu et al., 2020) to demonstrate the effectiveness of the pretraining strategy. Our submissions (Tencent AI Lab Machine Translation, TMT) in German/French/Spanish⇒English are each ranked 1st according to the official evaluation results in terms of BLEU scores.

Keywords: lab machine translation, Tencent AI Lab

HW-TSC's Submissions to the WMT21 Biomedical Translation Task

Association for Computational Linguistics, 2021 (article)

This paper describes the submission of Huawei Translation Service Center (HW-TSC) to the WMT21 biomedical translation task in two language pairs: Chinese↔English and German↔English (our registered team name is HuaweiTSC). Technical details are introduced in this paper, including the model framework, data pre-processing method and model enhancement strategies. In addition, using the WMT20 OK-aligned biomedical test set, we compare and analyze system performance under different strategies. On the WMT21 biomedical translation task, our systems in the English→Chinese and English→German directions get the highest BLEU scores among all submissions according to the official evaluation results.

Keywords: Huawei translation service

The Fujitsu DMATH Submissions for WMT21 News Translation and Biomedical Translation Tasks

Association for Computational Linguistics, 2021 (article)

This paper describes the Fujitsu DMATH systems used for WMT 2021 News Translation and Biomedical Translation tasks. We focused on low-resource pairs, using a simple system. We conducted experiments on English-Hausa, Xhosa-Zulu and English-Basque, and submitted the results for Xhosa→Zulu in the News Translation Task, and English→Basque in the Biomedical Translation Task, abstract and terminology translation subtasks. Our system combines BPE dropout, sub-subword features and back-translation with a Transformer (base) model, achieving good results on the evaluation sets.

Keywords: biomedical translation tasks, Fujitsu DMATH submissions, biomedical translation

Tencent Translation System for the WMT21 News Translation Task

Association for Computational Linguistics, 2021 (article)

This paper describes Tencent Translation systems for the WMT21 shared task. We participate in the news translation task on three language pairs: Chinese-English, English-Chinese and German-English. Our systems are built on various Transformer models with novel techniques adapted from our recent research work. First, we combine different data augmentation methods including back-translation, forward-translation and right-to-left training to enlarge the training data. We also apply language coverage bias, data rejuvenation and uncertainty-based sampling approaches to select content-relevant and high-quality data from large parallel and monolingual corpora. Except for in-domain fine-tuning, we also propose a fine-grained "one model one domain" approach to model characteristics of different news genres at fine-tuning and decoding stages. Besides, we use a greedy-based ensemble algorithm and a transductive ensemble method to further boost our systems. Based on our success in the last WMT, we continuously employed advanced techniques such as large batch training, data selection and data filtering. Finally, our constrained Chinese-English system achieves a case-sensitive BLEU score of 33.4, which is the highest among all submissions. The German-English system is ranked in second place.

Keywords: Tencent translation, Tencent translation system
