We propose a cascade of neural models that performs sentence classification, phrase recognition, and triple extraction to automatically structure the scholarly contributions of NLP publications. To identify the most important contribution sentences in a paper, we used a BERT-based classifier with positional features (Subtask 1). A BERT-CRF model was used to recognize and characterize relevant phrases in contribution sentences (Subtask 2). We categorized the triples into several types based on whether and how their elements were expressed in text, and addressed each type using separate BERT-based classifiers as well as rules (Subtask 3). Our system was officially ranked second in the Phase 1 evaluation and first in both parts of the Phase 2 evaluation. After fixing a submission error in Phase 1, our approach yields the best results overall. In this paper, in addition to a system description, we also provide further analysis of our results, highlighting the strengths and limitations of our approach. We make our code publicly available at https://github.com/Liu-Hy/nlp-contrib-graph.
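The data flow of such a three-stage cascade can be sketched as follows. This is a minimal illustration only, not the code from the repository above: the `Cascade` class, its field names, the relative-position feature, and the toy stand-in functions are all hypothetical. In a real system each stage would be a trained model (for example a BERT-based classifier or a BERT-CRF tagger) rather than a string heuristic.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Cascade:
    """Hypothetical three-stage pipeline; each stage is a pluggable callable."""
    is_contribution: Callable[[str, float], bool]        # Subtask 1: (sentence, relative position) -> keep?
    find_phrases: Callable[[str], List[str]]             # Subtask 2: sentence -> phrases
    build_triples: Callable[[List[str]], List[Triple]]   # Subtask 3: phrases -> triples

    def run(self, sentences: List[str]) -> List[Triple]:
        triples: List[Triple] = []
        n = len(sentences)
        for i, sentence in enumerate(sentences):
            # Assumed positional feature: position of the sentence in the paper, scaled to [0, 1].
            rel_pos = i / max(n - 1, 1)
            if not self.is_contribution(sentence, rel_pos):
                continue  # later stages only see sentences kept by stage 1
            phrases = self.find_phrases(sentence)
            triples.extend(self.build_triples(phrases))
        return triples


# Toy stand-ins (assumptions for illustration, not trained models).
def toy_classifier(sentence: str, rel_pos: float) -> bool:
    return sentence.lower().startswith("we ")


def toy_phrases(sentence: str) -> List[str]:
    return sentence.rstrip(".").split(" ", 2)


def toy_triples(phrases: List[str]) -> List[Triple]:
    return [(phrases[i], phrases[i + 1], phrases[i + 2])
            for i in range(0, len(phrases) - 2, 2)]


pipeline = Cascade(toy_classifier, toy_phrases, toy_triples)
result = pipeline.run(["Prior work is limited.", "We use BERT-CRF."])
print(result)
```

Keeping each stage behind a plain callable makes the cascade property explicit: an error in an early stage propagates, because a sentence rejected in the first stage never reaches phrase recognition or triple extraction.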