Event Prominence Extraction Combining a Knowledge-Based Syntactic Parser and a BERT Classifier for Dutch

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Research language: English

Created by Shamra Editor

Abstract

A core task in information extraction is event detection, which identifies event triggers in sentences; these triggers are typically classified into event types. In this study, an event is considered the unit for measuring diversity and similarity in news articles within the framework of a news recommendation system. Current typology-based event detection approaches fail to handle the variety of events expressed in real-world situations. To overcome this, we aim to perform event salience classification and explore whether a transformer model is capable of classifying new information into less and more general prominence classes. After comparing the performance of a Support Vector Machine (SVM) baseline and our transformer-based classifier on several event span formats, we conceived multi-word event spans as syntactic clauses. These spans are fed into our prominence classifier, which is fine-tuned on pre-trained Dutch BERT word embeddings. In addition, our approach outperforms a pipeline that combines a Conditional Random Field (CRF) approach to event-trigger word detection with the BERT-based classifier. To the best of our knowledge, we present the first event extraction approach that combines an expert-based syntactic parser with a transformer-based classifier for Dutch.
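The two-stage data flow described in the abstract (clause-like event spans first, prominence classification second) can be sketched roughly as follows. This is only an illustrative skeleton, not the authors' system: `split_into_clauses` is a hypothetical stand-in for the knowledge-based syntactic parser, `toy_prominence_classifier` is a hypothetical stand-in for the fine-tuned Dutch BERT classifier, and the labels `"high"` and `"low"` are invented placeholders because the abstract does not name the prominence classes.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EventSpan:
    text: str        # multi-word, clause-like event span
    prominence: str  # predicted prominence label (placeholder names)


def split_into_clauses(sentence: str) -> List[str]:
    """Hypothetical stand-in for the syntactic parser.

    Naively splits on commas and semicolons; a real parser would use
    syntactic structure rather than punctuation.
    """
    parts = re.split(r"[,;]", sentence)
    return [p.strip(" .") for p in parts if p.strip(" .")]


def toy_prominence_classifier(span: str) -> str:
    """Hypothetical stand-in for the fine-tuned BERT classifier.

    Arbitrary rule for illustration only: longer spans count as "high".
    """
    return "high" if len(span.split()) >= 4 else "low"


def extract_events(sentence: str,
                   classify: Callable[[str], str]) -> List[EventSpan]:
    """Stage 1: split into clause-like spans. Stage 2: classify each span."""
    return [EventSpan(clause, classify(clause))
            for clause in split_into_clauses(sentence)]


sentence = "De minister kondigde nieuwe maatregelen aan, zei de woordvoerder."
events = extract_events(sentence, toy_prominence_classifier)
for event in events:
    print(event.prominence, "|", event.text)
```

Passing the classifier in as a callable keeps the span extraction step independent of the model, so the toy rule could be swapped for a real classifier without changing `extract_events`.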

References used

https://aclanthology.org/

A Transition-based Parser for Unscoped Episodic Logical Forms

Association for Computational Linguistics, 2021 (article)

"Episodic Logic: Unscoped Logical Form" (EL-ULF) is a semantic representation capturing predicate-argument structure as well as more challenging aspects of language within the Episodic Logic formalism. We present the first learned approach for parsing sentences into ULFs, using a growing set of annotated examples. The results provide a strong baseline for future improvement. Our method learns a sequence-to-sequence model for predicting the transition action sequence within a modified cache transition system. We evaluate the efficacy of type grammar-based constraints, a word-to-symbol lexicon, and transition system state features in this task. Our system is available at https://github.com/genelkim/ulf-transition-parser. We also present the first official annotated ULF dataset at https://www.cs.rochester.edu/u/gkim21/ulf/resources/.

Incorporating medical knowledge in BERT for clinical relation extraction

Association for Computational Linguistics, 2021 (article)

In recent years, pre-trained language models (PLMs) such as BERT have proven to be very effective in diverse NLP tasks such as Information Extraction, Sentiment Analysis and Question Answering. Trained with massive general-domain text, these pre-trained language models capture rich syntactic, semantic and discourse information in the text. However, due to the differences between general and specific domain text (e.g., Wikipedia versus clinic notes), these models may not be ideal for domain-specific tasks (e.g., extracting clinical relations). Furthermore, it may require additional medical knowledge to understand clinical text properly. To solve these issues, in this research, we conduct a comprehensive examination of different techniques to add medical knowledge into a pre-trained BERT model for clinical relation extraction. Our best model outperforms the state-of-the-art systems on the benchmark i2b2/VA 2010 clinical relation extraction dataset.

CodRED: A Cross-Document Relation Extraction Dataset for Acquiring Knowledge in the Wild

Association for Computational Linguistics, 2021 (article)

Existing relation extraction (RE) methods typically focus on extracting relational facts between entity pairs within single sentences or documents. However, a large quantity of relational facts in knowledge bases can only be inferred across documents in practice. In this work, we present the problem of cross-document RE, making an initial step towards knowledge acquisition in the wild. To facilitate the research, we construct the first human-annotated cross-document RE dataset CodRED. Compared to existing RE datasets, CodRED presents two key challenges: Given two entities, (1) it requires finding the relevant documents that can provide clues for identifying their relations; (2) it requires reasoning over multiple documents to extract the relational facts. We conduct comprehensive experiments to show that CodRED is challenging to existing RE methods including strong BERT-based models.

End-to-end mBERT based Seq2seq Enhanced Dependency Parser with Linguistic Typology knowledge

Association for Computational Linguistics, 2021 (article)

We describe the NUIG solution for the IWPT 2021 Shared Task of Enhanced Dependency (ED) parsing in multiple languages. For this shared task, we propose and evaluate an end-to-end Seq2seq mBERT-based ED parser which predicts the ED-parse tree of a given input sentence as a relative head-position tag-sequence. Our proposed model is a multitasking neural network which performs five key tasks simultaneously, namely UPOS tagging, UFeat tagging, lemmatization, dependency parsing and ED parsing. Furthermore, we utilise the linguistic typology available in the WALS database to improve the ability of our proposed end-to-end parser to transfer across languages. Results show that our proposed Seq2seq ED parser performs on par with state-of-the-art ED parsers despite having a much simpler design.
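To illustrate what a relative head-position tag sequence can look like, here is a minimal sketch for the basic single-head tree case. It is an assumption about the general idea, not the NUIG system's actual tag scheme: the function names, the `"ROOT"` tag, and the signed-offset format are hypothetical, and enhanced dependencies (where a token may have several heads) are not covered.

```python
from typing import List


def encode_relative_heads(heads: List[int]) -> List[str]:
    """Turn absolute head indices into relative-offset tags.

    heads[i] is the 1-based index of the head of token i+1; 0 marks the root.
    Each non-root tag is the signed distance from the token to its head.
    """
    tags = []
    for position, head in enumerate(heads, start=1):
        if head == 0:
            tags.append("ROOT")
        else:
            tags.append(f"{head - position:+d}")
    return tags


def decode_relative_heads(tags: List[str]) -> List[int]:
    """Invert encode_relative_heads: recover absolute head indices."""
    heads = []
    for position, tag in enumerate(tags, start=1):
        heads.append(0 if tag == "ROOT" else position + int(tag))
    return heads


# "She reads books": "reads" is the root; "She" and "books" attach to it.
heads = [2, 0, 2]
tags = encode_relative_heads(heads)
print(tags)                         # ['+1', 'ROOT', '-1']
print(decode_relative_heads(tags))  # [2, 0, 2]
```

Because each token receives exactly one tag, the tree can be predicted as an ordinary output sequence of the same length as the sentence, which is what makes a sequence-to-sequence formulation possible.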

BERT-based Multi-Task Model for Country and Province Level MSA and Dialectal Arabic Identification

Association for Computational Linguistics, 2021 (article)

Dialect and standard language identification are crucial tasks for many Arabic natural language processing applications. In this paper, we present our deep learning-based system, submitted to the second NADI shared task for country-level and province-level identification of Modern Standard Arabic (MSA) and Dialectal Arabic (DA). The system is based on an end-to-end deep Multi-Task Learning (MTL) model to tackle both country-level and province-level MSA/DA identification. This MTL model consists of a shared Bidirectional Encoder Representations from Transformers (BERT) encoder, two task-specific attention layers, and two classifiers. Our key idea is to leverage both the task-discriminative and the inter-task shared features for country and province MSA/DA identification. The obtained results show that our MTL model outperforms single-task models on most subtasks.
