In this paper, we describe our submission to the WMT 2021 Metrics Shared Task. We use automatically generated questions and answers to evaluate the quality of Machine Translation (MT) systems. Our submission builds upon the recently proposed MTEQA framework. Experiments on the WMT20 evaluation datasets show that, at the system level, the MTEQA metric achieves performance comparable with other state-of-the-art solutions, while considering only part of the information in the whole translation.
References used
https://aclanthology.org/
This paper presents the results of the WMT21 Metrics Shared Task. Participants were asked to score the outputs of the translation systems competing in the WMT21 News Translation Task with automatic metrics on two different domains: news and TED talks
This paper describes the Papago submission to the WMT 2021 Quality Estimation Task 1: Sentence-level Direct Assessment. Our multilingual Quality Estimation system explores the combination of Pretrained Language Models and Multi-task Learning architecture
This paper describes the Charles University submission for the Terminology Translation Shared Task at WMT21. The objective of this task is to design a system which translates certain terms based on a provided terminology database, while preserving high over
This paper presents the JHU-Microsoft joint submission for the WMT 2021 quality estimation shared task. We participate only in Task 2 (post-editing effort estimation) of the shared task, focusing on target-side word-level quality estimation. The tech
This report describes Microsoft's machine translation systems for the WMT21 shared task on large-scale multilingual machine translation. We participated in all three evaluation tracks including Large Track and two Small Tracks where the former one is