
Using CollGram to Compare Formulaic Language in Human and Machine Translation


Publication date: 2021
Language: English





A comparison of formulaic sequences in human and neural machine translation of quality newspaper articles shows that neural machine translations contain fewer lower-frequency but strongly associated formulaic sequences (FSs) and more high-frequency FSs. These observations parallel the differences found between second-language learners of various levels and between translated and untranslated texts. The comparison among the neural machine translation systems indicates that some systems produce more FSs of both types than others.
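CollGram-style profiling rests on two standard association measures computed for each bigram: mutual information, which highlights lower-frequency but strongly associated pairs, and the t-score, which highlights high-frequency pairs. The sketch below is a minimal illustration of that scoring, assuming whitespace-tokenized text and the MI ≥ 5 / t ≥ 2 thresholds commonly cited in this literature; it is not the authors' implementation, which draws frequencies from a large reference corpus rather than from the text being profiled.

```python
import math
from collections import Counter

def bigram_profile(tokens):
    """Score each adjacent bigram with pointwise mutual information (MI)
    and the t-score, the two measures CollGram-style profiles use."""
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), f12 in bigrams.items():
        expected = unigrams[w1] * unigrams[w2] / n  # chance co-occurrence
        mi = math.log2(f12 / expected)              # favours rare, tight pairs
        t = (f12 - expected) / math.sqrt(f12)       # favours frequent pairs
        scores[(w1, w2)] = (mi, t)
    return scores

# Toy usage: count bigrams above the conventional thresholds
# (MI >= 5 for strongly associated FSs, t >= 2 for high-frequency FSs).
corpus = ("the quick brown fox jumps over the lazy dog " * 50).split()
profile = bigram_profile(corpus)
print(sum(mi >= 5 for mi, _ in profile.values()),
      sum(t >= 2 for _, t in profile.values()))
```

Comparing the proportions of bigrams that fall into these two categories in a human translation versus an NMT output is the kind of comparison the abstract summarizes.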

Related research

This paper discusses a classification-based approach to machine translation evaluation, as opposed to the regression-based approach common in the WMT Metrics task. Recent machine translation systems usually work well but sometimes make critical errors due to just a few wrong word choices. Our classification-based approach focuses on such errors using several error-type labels, for practical machine translation evaluation in an age of neural machine translation. We made additional annotations on the WMT 2015-2017 Metrics datasets with fluency and adequacy labels to distinguish different types of translation errors from syntactic and semantic viewpoints. We present our human evaluation criteria for the corpus development and automatic evaluation experiments using the corpus. The human evaluation corpus will be publicly available upon publication.
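As a rough, hypothetical illustration of the framing difference, the sketch below treats segment-level evaluation as multi-class classification over discrete error-type labels rather than regression to a quality score; the label set, example sentences, and features are invented for the example and are not taken from the paper's annotation scheme.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set: MT outputs paired with error-type labels
# (the paper distinguishes fluency and adequacy errors, among others).
outputs = [
    "the cat sat on on the mat",   # disfluent repetition
    "the dog ate the homework",    # no error
    "the cat sat under the mat",   # fluent but inaccurate word choice
]
labels = ["fluency_error", "no_error", "adequacy_error"]

# Classification-based evaluation: predict a discrete error type per
# segment instead of regressing to a continuous quality score.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(outputs, labels)
print(clf.predict(["the dog ate ate the homework"]))
```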
In this paper we introduce a vision towards establishing the Malta National Language Technology Platform, an ongoing effort that aims to provide a basis for enhancing Malta's official languages, Maltese and English, using machine translation. This will contribute to the current niche of language technology support for the low-resource Maltese language across multiple computational linguistics fields, such as speech processing, machine translation, text analysis, and multi-modal resources. The end goals are to remove language barriers, increase accessibility, foster cross-border services, and, most importantly, facilitate the preservation of the Maltese language.
Resin-based composite is widely used as a restorative material in the clinic, and dentists are concerned with placing resin composite restorations that survive long term. Materials and Methods: In 20 non-carious human third molars, a Class I cavity was prepared in each molar and restored with the methacrylate resin composite Tetric EvoCeram. The restored teeth were subjected to thermocycling. Specimens were randomly divided into two groups: in the first group, 10 restorations were repaired with Cojet® (3M ESPE, Germany) and restored with Filtek™ P90; in the second group, 10 restorations were also repaired with Cojet® but restored with Filtek™ Z350 XT. All teeth were then subjected to thermocycling again and later immersed in 0.5% basic fuchsin solution for 1 s. The teeth were transversely sectioned and observed under a stereomicroscope, and the degree of dye penetration was recorded and analyzed. Results: No statistically significant differences in microleakage were observed between the two groups, whether restored with the low-shrinkage silorane resin composite or the methacrylate resin composite.
In this paper, we describe our submissions to the Similar Language Translation Shared Task 2021. We built three systems in each direction for the Tamil ⇐⇒ Telugu language pair. This paper outlines experiments with various tokenization schemes used to train statistical models, and we also report the configuration of the submitted systems and the results they produced.
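The abstract does not name the tokenization schemes that were tried; as one hedged illustration, subword schemes such as BPE are often trained with SentencePiece before the text reaches the MT system. The file paths and vocabulary size below are placeholders, not the authors' settings.

```python
import sentencepiece as spm

# Train a BPE subword model on the source-side training text
# ("train.ta" is a hypothetical path; vocab_size is a typical knob
# varied in tokenization experiments).
spm.SentencePieceTrainer.train(
    input="train.ta",
    model_prefix="ta_bpe",
    vocab_size=8000,
    model_type="bpe",
)

# Apply the trained model; the MT toolkit then consumes the pieces.
sp = spm.SentencePieceProcessor(model_file="ta_bpe.model")
print(sp.encode("an example sentence", out_type=str))
```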
In this work, two Neural Machine Translation (NMT) systems were developed and evaluated as part of the bidirectional Tamil-Telugu similar-language translation subtask at WMT21. The OpenNMT-py toolkit was used to create quick prototypes of the systems; the models were then trained on the training datasets containing the parallel corpus and evaluated on the dev datasets provided as part of the task. Both systems were trained on a DGX station with 4 V100 GPUs. The first NMT system is a Transformer-based 6-layer encoder-decoder model, trained for 100,000 steps with a configuration similar to the one provided by OpenNMT-py, and it was used to create a model for bidirectional translation. The second NMT system contains two unidirectional translation models with the same configuration as the first, with the addition of Byte Pair Encoding (BPE) for subword tokenization through the pre-trained MultiBPEmb model. Based on the dev-set evaluation metrics for both systems, the first system, i.e. the vanilla Transformer model, was submitted as the primary system. Since there were no improvements in the metrics during training of the second system with BPE, it was submitted as a contrastive system.
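For reference, the pre-trained MultiBPEmb model mentioned above is distributed through the bpemb Python package; the sketch below shows the usual way to load it and segment text into subwords. The vocabulary size and embedding dimension are assumed values, since the paper does not state which variant was used.

```python
from bpemb import BPEmb

# Load the multilingual pre-trained BPE model (MultiBPEmb);
# vs (vocabulary size) and dim are assumptions, not the paper's settings.
multibpemb = BPEmb(lang="multi", vs=100000, dim=300)

# Segment a sentence into BPE subword units for the NMT system.
print(multibpemb.encode("this is a test sentence"))

# Matching pre-trained subword embeddings, which could initialise the
# Transformer's embedding layer.
print(multibpemb.embed("this is a test sentence").shape)
```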


