Scientific documents are replete with measurements reported in varied formats and styles. In a document that contains multiple quantities and measured entities, associating each quantity with its corresponding measured entity is therefore challenging, and a method is needed to efficiently extract all measurements together with their related attributes. To this end, we propose a novel model for measurement relation extraction (MRE), the task of recognizing the relations among the measured entities, quantities, and conditions mentioned in a document. Our model employs a deep translation-based architecture that dynamically induces the important words in the document in order to classify the relation between a pair of entities. Furthermore, we introduce a novel regularization technique based on the Information Bottleneck (IB) to filter noisy information out of the induced set of important words. Our experiments on the recent SemEval 2021 Task 8 datasets demonstrate the effectiveness of the proposed model.
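To make the pairing problem concrete, the following is a minimal sketch of how quantity and entity spans might be paired, and of why a purely positional heuristic falls short. It is not the proposed model: the `Span` type, the example sentence, and the nearest-entity baseline are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Span:
    """Hypothetical token span: [start, end) with a coarse type label."""
    start: int
    end: int
    kind: str  # "quantity" or "entity"


def span_text(tokens, span):
    """Return the surface text covered by a span."""
    return " ".join(tokens[span.start:span.end])


def gap(a, b):
    """Number of tokens separating two spans (0 if they touch or overlap)."""
    return max(a.start - b.end, b.start - a.end, 0)


def candidate_pairs(spans):
    """Enumerate every (quantity, entity) pair a relation classifier would score."""
    quantities = [s for s in spans if s.kind == "quantity"]
    entities = [s for s in spans if s.kind == "entity"]
    return list(product(quantities, entities))


def nearest_entity_baseline(spans):
    """Naive baseline (an assumption, not the paper's method):
    link each quantity to the closest entity by token distance."""
    links = {}
    for quantity, entity in candidate_pairs(spans):
        if quantity not in links or gap(quantity, entity) < gap(quantity, links[quantity]):
            links[quantity] = entity
    return links


# Made-up example sentence with hand-marked spans.
tokens = "The sample was heated to 300 K and the pressure reached 2 atm .".split()
spans = [
    Span(1, 2, "entity"),     # "sample"
    Span(5, 7, "quantity"),   # "300 K"
    Span(9, 10, "entity"),    # "pressure"
    Span(11, 13, "quantity"), # "2 atm"
]

links = {
    span_text(tokens, q): span_text(tokens, e)
    for q, e in nearest_entity_baseline(spans).items()
}
print(links)  # {'300 K': 'pressure', '2 atm': 'pressure'}
```

In this made-up sentence the distance heuristic links "300 K" to "pressure", although the temperature describes the sample. Errors of this kind are why the relation between a pair is better classified from the surrounding words than from position alone.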