
Measuring Translationese across Levels of Expertise: Are Professionals more Surprising than Students?


Publication date: 2021
Language of the research: English
 Created by Shamra Editor





The present paper deals with a computational analysis of translationese in professional and student English-to-German translations belonging to different registers. Building upon an information-theoretical approach, we test translation conformity to source and target language in terms of a neural language model's perplexity over part-of-speech (PoS) sequences. Our primary focus is on register diversification vs. convergence, reflected in the use of constructions eliciting a higher vs. lower perplexity score. Our results show that, against our expectations, professional translations elicit higher perplexity scores from a target language model than students' translations. An analysis of the distribution of PoS patterns across registers shows that this apparent paradox is the effect of higher stylistic diversification and register sensitivity in professional translations. Our results contribute to the understanding of human translationese and shed light on the variation in texts generated by different translators, which is valuable for translation studies, multilingual language processing, and machine translation.
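
To make the idea of perplexity over PoS sequences concrete, here is a minimal sketch. It is not the paper's method: the abstract describes a neural language model, whereas this example substitutes a simple bigram model with add-one smoothing so that it stays self-contained. The function names, the toy tag sequences, and the smoothing scheme are all assumptions made for illustration.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count bigrams over PoS sequences padded with <s> and </s>.

    Stand-in for the neural language model mentioned in the abstract
    (assumption: a bigram model is used here only for illustration).
    """
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = {"</s>"}
    for tags in sequences:
        padded = ["<s>"] + list(tags) + ["</s>"]
        vocab.update(tags)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab

def perplexity(tags, model):
    """Perplexity of one PoS sequence: exp of the mean negative log-probability."""
    context_counts, bigram_counts, vocab = model
    padded = ["<s>"] + list(tags) + ["</s>"]
    v = len(vocab) + 1  # add-one smoothing; +1 reserves mass for unseen tags
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        p = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + v)
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(padded) - 1))

# Toy "target language" training data (hypothetical tag sequences).
model = train_bigram([
    ["DET", "NOUN", "VERB"],
    ["DET", "ADJ", "NOUN", "VERB"],
    ["PRON", "VERB", "DET", "NOUN"],
])

conventional = perplexity(["DET", "NOUN", "VERB"], model)
unusual = perplexity(["VERB", "VERB", "ADJ", "DET"], model)
print(conventional < unusual)
```

Under this reading, a sequence that follows frequent target-language PoS patterns receives a lower perplexity than one built from rare transitions, which is the sense in which higher scores for professional translations indicate less conventional, more diversified constructions.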



References used
https://aclanthology.org/

Read More

In the field of natural language processing, ensembles are broadly known to be effective in improving performance. This paper analyzes how ensembles of neural machine translation (NMT) models affect performance by designing various experimental setups (i.e., intra-, inter-ensemble, and non-convergence ensemble). For an in-depth examination, we analyze each ensemble method with respect to several aspects such as different attention models and vocab strategies. Experimental results show that ensembling does not always result in performance increases and yield noteworthy negative findings.
This study assesses the potential anti-atherosclerotic role of adiponectin. It includes 54 patients at Al-Assad University Hospital in Lattakia who were candidates for catheterization, and a control group of 25 individuals. Serum adiponectin levels were measured in both groups, and levels of hs-CRP were also measured in both groups and compared with adiponectin levels. Catheterization was performed in the patient group, and the angiography results were categorized as mild, moderate, or severe coronary artery disease (CAD) according to the SYNTAX score. The study concludes that serum adiponectin levels are higher in patients with mild CAD (23.86 μg/ml) than in patients with moderate to severe CAD (13.62 μg/ml); (p=0.001<0.05). The average serum adiponectin level in the control group was 17.1 μg/ml. There was no statistically significant relation between the concentration of hs-CRP and the concentration of adiponectin. However, there was a statistically significant difference between the average concentration of hs-CRP in the patient group (7.1 mg/l) and the control group (2.7 mg/l); (p=0.003<0.05).
Word embeddings are powerful representations that form the foundation of many natural language processing architectures, both in English and in other languages. To gain further insight into word embeddings, we explore their stability (e.g., overlap between the nearest neighbors of a word in different embedding spaces) in diverse languages. We discuss linguistic properties that are related to stability, drawing out insights about correlations with affixing, language gender systems, and other features. This has implications for embedding use, particularly in research that uses them to study language trends.
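
The notion of stability as nearest-neighbor overlap can be sketched as follows. The abstract gives only the general idea, so the details here are assumptions: cosine similarity as the distance measure, a fixed neighborhood size `k`, normalization of the overlap by `k`, and the toy two-dimensional vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(word, space, k):
    """Return the k words closest to `word` in one embedding space."""
    target = space[word]
    others = [w for w in space if w != word]
    others.sort(key=lambda w: cosine(target, space[w]), reverse=True)
    return set(others[:k])

def stability(word, space_a, space_b, k=10):
    """Fraction of the k nearest neighbors shared across two spaces
    (assumed definition; the source only says 'overlap')."""
    shared = nearest_neighbors(word, space_a, k) & nearest_neighbors(word, space_b, k)
    return len(shared) / k

# Hypothetical toy spaces over the same vocabulary.
space_a = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0], "bus": [0.1, 0.9]}
space_b = {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "car": [0.9, 0.1], "bus": [0.1, 0.9]}

print(stability("cat", space_a, space_a, k=1))
print(stability("cat", space_a, space_b, k=1))
```

A word whose neighborhood is the same in both spaces scores 1.0, and a word whose neighbors are entirely different scores 0.0.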
With the rise of research on toxic comment classification, more and more annotated datasets have been released. The wide variety of the task (different languages, different labeling processes and schemes) has led to a large amount of heterogeneous datasets that can be used for training and testing very specific settings. Despite recent efforts to create web pages that provide an overview, most publications still use only a single dataset. They are not stored in one central database, they come in many different data formats and it is difficult to interpret their class labels and how to reuse these labels in other projects. To overcome these issues, we present a collection of more than thirty datasets in the form of a software tool that automatizes downloading and processing of the data and presents them in a unified data format that also offers a mapping of compatible class labels. Another advantage of that tool is that it gives an overview of properties of available datasets, such as different languages, platforms, and class labels to make it easier to select suitable training and test data.
Hepatocellular carcinoma (HCC) is a common malignancy and the leading cause of death worldwide, due to late detection and high recurrence rates. Osteopontin (OPN) has various functions, including prevention of apoptosis and modulation of angiogenesis which lead to tumor formation and progression, although the exact mechanisms for the development of cancer are still unknown.
