The Use of Corpora in an Interdisciplinary Approach to Localization

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

interdisciplinary approach, descriptive translation studies, translation studies

Abstract

Translation Studies, and more specifically its subfield Descriptive Translation Studies [Holmes, 1988/2000], is, according to many scholars [Gambier, 2009; Nenopoulou, 2007; Munday, 2001/2008; Hermans, 1999; Snell-Hornby et al., 1994, etc.], a highly interdisciplinary field of study. The aim of the present paper is to describe the role of polysemiotic corpora in the study of university website localization from a multidisciplinary perspective. More specifically, the paper gives an overview of ongoing postdoctoral research on the identity formation of Greek university websites, focusing on the methodology adopted for corpus compilation, which draws on methodological tools and concepts from various fields such as Translation Studies, social semiotics, cultural studies, critical discourse analysis and marketing. The objects of comparative analysis are Greek and French university websites, both original and translated into English, as well as original British and American university websites. Research findings to date show that polysemiotic corpora can be a valuable tool for both quantitative and qualitative analysis of website localization, for scholars as well as for translation professionals working with multimodal genres.

References used

https://aclanthology.org/

Human Evaluation of Creative NLG Systems: An Interdisciplinary Survey on Recent Papers

495 - Association for Computational Linguistics 2021 (article)

We survey human evaluation in papers presenting work on creative natural language generation that have been published in INLG 2020 and ICCC 2020. The most typical human evaluation method is a scaled survey, typically on a 5-point scale, while many other less common methods exist. The most commonly evaluated parameters are meaning, syntactic correctness, novelty, relevance and emotional value, among many others. Our guidelines for future evaluation include clearly defining the goal of the generative system, asking questions as concrete as possible, testing the evaluation setup, using multiple different evaluation setups, reporting the entire evaluation process and potential biases clearly, and finally analyzing the evaluation results in a more profound way than merely reporting the most typical statistics.

creative nlg systems, creative nlg, recent papers

A multilabel approach to morphosyntactic probing

773 - Association for Computational Linguistics 2021 (article)

We propose using a multilabel probing task to assess the morphosyntactic representations of multilingual word embeddings. This tweak on canonical probing makes it easy to explore morphosyntactic representations, both holistically and at the level of individual features (e.g., gender, number, case), and leads more naturally to the study of how language models handle co-occurring features (e.g., agreement phenomena). We demonstrate this task with multilingual BERT (Devlin et al., 2018), training probes for seven typologically diverse languages: Afrikaans, Croatian, Finnish, Hebrew, Korean, Spanish, and Turkish. Through this simple but robust paradigm, we verify that multilingual BERT renders many morphosyntactic features simultaneously extractable. We further evaluate the probes on six held-out languages: Arabic, Chinese, Marathi, Slovenian, Tagalog, and Yoruba. This zero-shot style of probing has the added benefit of revealing which cross-linguistic properties a language model recognizes as being shared by multiple languages.

multilabel approach, multilingual bert, morphosyntactic

An Alignment-Based Approach to Semi-Supervised Bilingual Lexicon Induction with Small Parallel Corpora

831 - Association for Computational Linguistics 2021 (article)

Aimed at generating a seed lexicon for use in downstream natural language tasks, unsupervised methods for bilingual lexicon induction have received much attention in the academic literature recently. While interesting, fully unsupervised settings are unrealistic; small amounts of bilingual data are usually available due to the existence of massively multilingual parallel corpora, or linguists can create small amounts of parallel data. In this work, we demonstrate an effective bootstrapping approach for semi-supervised bilingual lexicon induction that capitalizes upon the complementary strengths of two disparate methods for inducing bilingual lexicons. Whereas statistical methods are highly effective at inducing correct translation pairs for words frequently occurring in a parallel corpus, monolingual embedding spaces have the advantage of having been trained on large amounts of data, and therefore may induce accurate translations for words absent from the small corpus. By combining these relative strengths, our method achieves state-of-the-art results on 3 of 4 language pairs in the challenging VecMap test set using minimal amounts of parallel data and without the need for a translation dictionary. We release our implementation at www.blind-review.code.

bilingual lexicon induction, semi-supervised bilingual lexicon, lexicon induction

Multilingual Sequence Labeling Approach to solve Lexical Normalization

1153 - Association for Computational Linguistics 2021 (article)

The task of converting nonstandard text to standard, readable text is known as lexical normalization. Almost all Natural Language Processing (NLP) applications require text data in normalized form to build quality task-specific models. Hence, lexical normalization has been proven to improve the performance of numerous natural language processing tasks on social media. This study formulates the Lexical Normalization task as a Sequence Labeling problem and proposes a sequence labeling approach, in combination with a word-alignment technique, to solve it. The goal is to use a single model to normalize text in various languages, namely Croatian, Danish, Dutch, English, Indonesian-English, German, Italian, Serbian, Slovenian, Spanish, Turkish, and Turkish-German. This is a shared task at the 2021 7th Workshop on Noisy User-generated Text (W-NUT), in which the participants are expected to create a system/model that performs lexical normalization, which is the translation of non-canonical texts into their canonical equivalents, comprising data from over 12 languages. The proposed single multilingual model achieves an overall ERR score of 43.75 on intrinsic evaluation and an overall Labeled Attachment Score (LAS) of 63.12 on extrinsic evaluation. Further, the proposed method achieves the highest Error Reduction Rate (ERR) score of 61.33 among the participants in the shared task. This study highlights the effects of using additional training data to get better results, as well as of using a pre-trained language model trained on multiple languages rather than only on one language.

lexical normalization, sequence labeling approach, lexical normalization task

The Corpora They Are a-Changing: a Case Study in Italian Newspapers

513 - Association for Computational Linguistics 2021 (article)

The use of automatic methods for the study of lexical semantic change (LSC) has led to the creation of evaluation benchmarks. Benchmark datasets, however, are intimately tied to the corpus used for their creation, which calls into question their reliability as well as the robustness of automatic methods. This contribution investigates these aspects, showing the impact of unforeseen social and cultural dimensions. We also identify a set of additional issues (OCR quality, named entities) that impact the performance of the automatic methods, especially when used to discover LSC.

italian newspapers, study in italian
