Translation Memory Retrieval Using Lucene

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

A translation memory (TM) system, a major component of computer-assisted translation (CAT), is widely used to improve human translators' productivity by making effective use of previously translated resources. We propose a method for high-speed retrieval from a large translation memory by means of similarity evaluation based on a vector model, and we present the experimental results. Through our experiment using Lucene, an open-source information retrieval search engine, we conclude that it is possible to achieve a real-time retrieval speed of about tens of microseconds even for a large translation memory with 5 million segment pairs.
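To make the idea of vector-model similarity retrieval concrete, here is a minimal sketch in plain Python. It does not use Lucene and is not the authors' implementation. The class name `TinyTM`, the TF-IDF weighting, the whitespace tokenizer, and the sample segment pairs are all assumptions chosen for illustration. It shows the general technique the abstract refers to: source segments are indexed as term vectors in an inverted index, and a query segment is scored against only those stored segments that share at least one term with it.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems use analyzers)."""
    return text.lower().split()


class TinyTM:
    """Toy translation memory with cosine-similarity retrieval over TF-IDF vectors."""

    def __init__(self, pairs):
        self.pairs = list(pairs)  # [(source segment, target segment), ...]
        self.postings = defaultdict(list)  # term -> [(segment id, term frequency)]
        seg_tfs = [Counter(tokenize(src)) for src, _ in self.pairs]
        for seg_id, tf in enumerate(seg_tfs):
            for term, count in tf.items():
                self.postings[term].append((seg_id, count))
        n = len(self.pairs)
        # Smoothed inverse document frequency, always positive.
        self.idf = {t: math.log(1 + n / len(p)) for t, p in self.postings.items()}
        # Precomputed vector length of every stored source segment.
        self.norms = [
            math.sqrt(sum((c * self.idf[t]) ** 2 for t, c in tf.items()))
            for tf in seg_tfs
        ]

    def search(self, query, top_k=3):
        """Return up to top_k (score, source, target) tuples, best match first."""
        q_tf = Counter(tokenize(query))
        # Query terms that never occur in the memory are ignored.
        q_weights = {t: c * self.idf[t] for t, c in q_tf.items() if t in self.idf}
        q_norm = math.sqrt(sum(w * w for w in q_weights.values()))
        if q_norm == 0:
            return []
        dots = defaultdict(float)
        # Only segments sharing a term with the query are ever touched.
        for term, q_w in q_weights.items():
            for seg_id, count in self.postings[term]:
                dots[seg_id] += q_w * count * self.idf[term]
        ranked = sorted(
            ((dot / (q_norm * self.norms[i]), i) for i, dot in dots.items()),
            reverse=True,
        )
        return [(round(score, 3), *self.pairs[i]) for score, i in ranked[:top_k]]


# Made-up English-French segment pairs, for illustration only.
tm = TinyTM([
    ("press the power button", "appuyez sur le bouton marche"),
    ("open the file menu", "ouvrez le menu fichier"),
    ("press the reset button twice", "appuyez deux fois sur le bouton reset"),
])
hits = tm.search("press the power button")
```

The inverted index is what keeps lookups fast as the memory grows: the cost of a query depends on the posting lists of its own terms rather than on the total number of stored segment pairs.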

References used

https://aclanthology.org/

A Comparison of the Word Similarity Measurement in English-Arabic Translation Memory Segment Retrieval Including an Inflectional Affix Intervention

Association for Computational Linguistics, 2021 (article)

The aim of this paper is to investigate the similarity measurement approach of translation memory (TM) in five representative computer-aided translation (CAT) tools when retrieving inflectional verb-variation sentences in Arabic to English translation. In English, inflectional affixes in verbs include suffixes only; unlike English, verbs in Arabic derive voice, mood, tense, number and person through various inflectional affixes, e.g. before or after a verb root. The research question focuses on establishing whether the TM similarity algorithm measures a combination of the inflectional affixes as a word or as a character intervention when retrieving a segment. If it is dealt with as a character intervention, are the types of intervention penalized equally or differently? This paper experimentally examines, through a black box testing methodology and a test suite instrument, the penalties that TM systems' current algorithms impose when input segments and retrieved TM sources are exactly the same, except for a difference in an inflectional affix. It would be expected that, if TM systems had some linguistic knowledge, the penalty would be very light, which would be useful to translators, since a high-scoring match would be presented near the top of the list of proposals. However, analysis of TM systems' output shows that inflectional affixes are penalized more heavily than expected, and in different ways. They may be treated as an intervention on the whole word, or as a single character change.
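The difference between a whole-word intervention and a single-character intervention can be illustrated with a small sketch. The scoring formula below (one minus the edit distance divided by the longer length) is an assumption made for illustration; the CAT tools examined in the paper do not disclose their formulas, and this is not a reconstruction of any of them. The two sample segments are hypothetical and differ only in the verb prefix.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_score(a, b):
    """Assumed match score: 1 - distance / length of the longer sequence."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Hypothetical segments that differ only in the inflectional prefix of the verb.
query = "نكتب الدرس"   # "we write the lesson"
stored = "يكتب الدرس"  # "he writes the lesson"

# Character-level comparison: the affix change costs one character out of ten.
char_score = fuzzy_score(query, stored)

# Word-level comparison: the affix change costs one whole word out of two.
word_score = fuzzy_score(query.split(), stored.split())
```

Under these assumptions the same one-letter affix change yields a much lower score when the comparison is made over words than when it is made over characters, which is the kind of divergence the black-box tests are designed to expose.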

Towards New Generation Translation Memory Systems

Association for Computational Linguistics, 2021 (article)

Despite the enormous popularity of Translation Memory systems and the active research in the field, their language processing features still suffer from certain limitations. While many recent papers focus on the semantic matching capabilities of TMs, this planned study will address how these tools perform when dealing with longer segments and whether this could be a cause of lower match scores. An experiment will be carried out on corpora from two different (repetitive) domains. Following the results, recommendations for future developments of new TMs will be made.

Introducing linguistic transformation to improve translation memory retrieval. Results of a professional translators' survey for Spanish, French and Arabic

Association for Computational Linguistics, 2021 (article)

Translation memory systems (TMS) are the main component of computer-assisted translation (CAT) tools. They store translations and save time by presenting stored translations from the database through several types of matching, such as fuzzy matches, which are calculated by algorithms like edit distance. However, studies have demonstrated the linguistic deficiencies of these systems and the difficulty of retrieving data or obtaining a high match percentage, especially after syntactic and semantic transformations such as an active/passive voice change, a change of word order, or substitution by a synonym or a personal pronoun. This paper presents the results of a pilot study in which we analyze qualitative and quantitative data from questionnaires conducted with professional translators of Spanish, French and Arabic in order to improve the effectiveness of TMS and explore the possibilities for integrating further linguistic processing based on ten transformation types. The results are encouraging, and they allowed us to learn about the translation process itself, from which we propose a pre-editing processing tool to improve the matching and retrieval processes.

Integration of Machine Translation and Translation Memory: Post-Editing Efforts

Association for Computational Linguistics, 2021 (article)

The development of translation technologies such as Translation Memory (TM) and Machine Translation (MT) has completely changed the translation industry and translators' workflow in recent decades. Nevertheless, TM and MT were developed separately until very recently. This ongoing project will study the external integration of TM and MT, examining whether translators' productivity and post-editing efforts are higher or lower than when using only TM. To this end, we will conduct an experiment in which translation students and professional translators will be asked to translate two short texts; then we will check the post-editing efforts (temporal, technical and cognitive) and the quality of the translated texts.

Improving Arabic Information Retrieval Results Semantically Using Ontology

Al-Baath University, 2016 (research paper)

This research proposes a new way to improve Arabic semantic search results by abstractively summarizing Arabic texts (abstractive summary) using natural language processing (NLP) algorithms, Word Sense Disambiguation (WSD), and techniques for measuring semantic similarity in the Arabic WordNet ontology.

Keywords: Natural Language Processing (NLP), Information Retrieval (IR), Abstractive Summarization, Arabic WordNet (AWN), Conceptual Semantic Relation, Semantic Similarity, Semantic Analysis, Word Sense Disambiguation (WSD)
