We present our submission to the Shared Task in Evaluating Accuracy at the INLG 2021 Conference. Our evaluation protocol relies on three main components: rules and text classifiers that pre-annotate the dataset, a human annotator who validates the pre-annotations, and a web interface that facilitates this validation. Our submission in fact consists of two submissions: we first analyze the performance of the rules and classifiers alone (pre-annotations), and then the human evaluation aided by these pre-annotations through the web interface (hybrid). The code for the web interface and the classifiers is publicly available.
References used
https://aclanthology.org/