In this paper, we address two challenges in translating ancient Chinese text: (1) the linguistic gap between eras results in poor-quality translations, and (2) most translations lack the contextual information that is often crucial to understanding the text. To this end, we improve upon past translation techniques by reframing the task as a multi-label prediction task in which the model predicts both the translation and its particular era. We observe that this helps to bridge the linguistic gap, as chronological context is also used as auxiliary information. We validate our framework on a parallel corpus annotated with chronology information and show experimentally its efficacy in producing quality translation outputs. We release both the code and the data for future research.
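One way to picture the joint prediction of translation and era is as a combined training objective. The abstract does not specify the model architecture or the loss, so the following is only a minimal sketch under stated assumptions: the function names, the weighting parameter `era_weight`, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target class under a probability distribution."""
    return -math.log(probs[target_index])


def joint_loss(token_probs, target_tokens, era_probs, target_era, era_weight=0.5):
    """Hypothetical joint objective (an assumption, not the paper's formula):
    mean per-token translation loss plus a weighted era-classification loss."""
    translation_loss = sum(
        cross_entropy(p, t) for p, t in zip(token_probs, target_tokens)
    ) / len(target_tokens)
    era_loss = cross_entropy(era_probs, target_era)
    return translation_loss + era_weight * era_loss


# Toy values: two target tokens over a 3-word vocabulary, three candidate eras.
token_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
target_tokens = [0, 1]
era_probs = [0.25, 0.5, 0.25]
target_era = 1

loss = joint_loss(token_probs, target_tokens, era_probs, target_era)
print(f"joint loss: {loss:.4f}")
```

In this sketch the era term acts as an auxiliary signal: setting `era_weight` to 0 recovers a plain translation loss, while larger values push the model to also get the era right.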
References used
https://aclanthology.org/