In this paper, we present NEREL, a Russian dataset for named entity recognition and relation extraction. NEREL is significantly larger than existing Russian datasets: to date, it contains 56K annotated named entities and 39K annotated relations. Its key difference from previous datasets is the annotation of nested named entities, as well as relations within nested entities and at the discourse level. NEREL can facilitate the development of novel models that extract relations between nested named entities, as well as relations at both the sentence and document levels. NEREL also contains annotations of events involving named entities and their roles in those events. The NEREL collection is available at https://github.com/nerel-ds/NEREL.
References used
https://aclanthology.org/