In this paper, we present VGaokao, a new verification-style reading comprehension dataset drawn from the Chinese Language tests of Gaokao. Unlike existing efforts, the dataset was originally designed to evaluate native speakers and thus requires more advanced language understanding skills. To address the challenges in VGaokao, we propose a novel Extract-Integrate-Compete approach, which iteratively selects complementary evidence with a novel query updating mechanism and adaptively distills supportive evidence, followed by a pairwise competition that pushes models to learn the subtle differences among similar text pieces. Experiments show that our methods outperform various baselines on VGaokao with retrieved complementary evidence, while offering efficiency and explainability. Our dataset and code are released for further research.
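To make the idea of iteratively selecting complementary evidence with a query update more concrete, here is a minimal sketch. It is not the paper's implementation: the abstract does not specify the scoring function or the update rule, so the token-overlap score, the rule of dropping already-covered query terms, and the names `tokenize`, `select_evidence`, and `max_steps` are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a stand-in for a real tokenizer (assumption)."""
    return re.findall(r"\w+", text.lower())


def select_evidence(query, sentences, max_steps=2):
    """Greedily pick sentences that cover query terms not yet covered.

    Hypothetical illustration only: scoring is plain token overlap, and the
    "query update" simply removes terms that selected evidence already covers.
    """
    remaining = set(tokenize(query))
    selected = []
    for _ in range(max_steps):
        best, best_overlap = None, set()
        for sent in sentences:
            if sent in selected:
                continue
            overlap = remaining & set(tokenize(sent))
            # Strict comparison: ties keep the earlier sentence.
            if len(overlap) > len(best_overlap):
                best, best_overlap = sent, overlap
        if best is None:
            # No remaining sentence shares a term with the updated query.
            break
        selected.append(best)
        # Query update: drop covered terms so the next pick is complementary.
        remaining -= best_overlap
    return selected
```

Under these assumptions, a statement that combines two facts tends to retrieve one sentence per fact, because terms covered by the first pick no longer contribute to the score of later candidates.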
References used
https://aclanthology.org/
Implicit event argument extraction (EAE) is a crucial document-level information extraction task that aims to identify event arguments beyond the sentence level. Despite many efforts for this task, the lack of enough training data has long impeded th
Transformer-based pre-trained models, such as BERT, have achieved remarkable results on machine reading comprehension. However, due to the constraint of encoding length (e.g., 512 WordPiece tokens), a long document is usually split into multiple chun
This paper introduces the SemEval-2021 shared task 4: Reading Comprehension of Abstract Meaning (ReCAM). This shared task is designed to help evaluate the ability of machines in representing and understanding abstract concepts. Given a passage and the
We propose a simple method to generate multilingual question and answer pairs on a large scale through the use of a single generative model. These synthetic samples can be used to improve the zero-shot performance of multilingual QA models on target
This work describes the adaptation of a pretrained sequence-to-sequence model to the task of scientific claim verification in the biomedical domain. We propose a system called VerT5erini that exploits T5 for abstract retrieval, sentence selection, an