This paper presents our winning contribution to SemEval 2021 Task 8: MeasEval. The task is to identify counts and measurements in clinical scientific discourse, including quantities, entities, properties, qualifiers, units, modifiers, and the relations among them. It can be cast as a joint entity and relation extraction problem. Accordingly, we propose CONNER, a cascade count and measurement extraction tool that identifies entities and their corresponding relations in a two-step pipeline model. We describe the proposed model in detail, and we also investigate the impact of its essential modules and of our in-process technical schemes.
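To make the two-step cascade concrete, the following is a minimal illustrative sketch and not the CONNER implementation. It assumes a toy regex as a stand-in for the entity extraction step and a nearest-span heuristic as a stand-in for the relation extraction step. All names here (`Span`, `Relation`, `QUANTITY_RE`, `extract_quantities`, `find_entities`, `link`, `run_pipeline`, the unit list, the entity lexicon, and the `HasQuantity` label) are hypothetical choices made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    label: str  # e.g. "Quantity" or "Entity" (illustrative labels)
    start: int  # character offset where the span begins
    end: int    # character offset just past the span
    text: str


@dataclass(frozen=True)
class Relation:
    label: str
    head: Span  # the measured entity
    tail: Span  # the quantity attached to it


# Step 1 stand-in: a number followed by an optional unit from a tiny, assumed list.
QUANTITY_RE = re.compile(r"\d+(?:\.\d+)?(?:\s?(?:mg|kg|ml|mm|cm)\b|\s?%)?")


def extract_quantities(text):
    """Return every quantity span the toy regex finds, in text order."""
    return [Span("Quantity", m.start(), m.end(), m.group())
            for m in QUANTITY_RE.finditer(text)]


def find_entities(text, lexicon):
    """Return candidate entity spans by exact lookup in an assumed lexicon."""
    spans = []
    for term in lexicon:
        for m in re.finditer(re.escape(term), text):
            spans.append(Span("Entity", m.start(), m.end(), m.group()))
    return spans


def link(quantities, entities):
    """Step 2 stand-in: attach each quantity to the nearest entity by start offset."""
    relations = []
    for q in quantities:
        if not entities:
            break
        nearest = min(entities, key=lambda e: abs(e.start - q.start))
        relations.append(Relation("HasQuantity", nearest, q))
    return relations


def run_pipeline(text, lexicon):
    """Run step 1 (span extraction), then step 2 (relation linking) on its output."""
    quantities = extract_quantities(text)
    entities = find_entities(text, lexicon)
    return quantities, link(quantities, entities)


if __name__ == "__main__":
    sentence = "The patient received 5 mg of aspirin and 250 ml of saline."
    _, rels = run_pipeline(sentence, ["aspirin", "saline"])
    for r in rels:
        print(r.head.text, "<-", r.label, "-", r.tail.text)
```

The point of the sketch is the cascade shape: the second step consumes only the spans produced by the first, so any span missed in step 1 cannot be recovered in step 2. A learned system would replace both stand-ins with trained models.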
References used
https://aclanthology.org/