Learning Entity-Likeness with Multiple Approximate Matches for Biomedical NER

Added by Association for Computational Linguistics (article)

Publication date 2021

Field: Artificial Intelligence

Language: English

Created by Shamra Editor

Keywords: multiple approximate matches, biomedical named entities, approximate matches

Abstract

Biomedical named entities are complex, so approximate matching has been used to improve entity coverage. However, the usual approximate matching approach fetches only one matching result, which is often noisy. In this work, we propose a method for biomedical NER that fetches multiple approximate matches for a given phrase and leverages their variations to estimate entity-likeness. The model uses pooling to discard unnecessary information from the noisy matching results and learns the entity-likeness of the phrase from the multiple approximate matches. Experimental results on three benchmark datasets from the biomedical domain, BC2GM, NCBI-disease, and BC4CHEMD, demonstrate the effectiveness of the approach. Our model improves the average by up to +0.21 points compared to a BioBERT-based NER.
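
As a rough illustration of the idea in the abstract, the sketch below fetches several approximate dictionary matches for a phrase and max-pools their feature vectors into one fixed-size vector. The abstract does not describe the paper's retrieval method or encoder, so everything here is an assumption: `difflib.SequenceMatcher` stands in for the approximate matcher, `char_features` is a hypothetical hashed character-bigram featurizer, and the function names are illustrative only.

```python
from difflib import SequenceMatcher


def top_k_matches(phrase, dictionary, k=3):
    """Return the k dictionary entries most similar to `phrase` as (score, entry)."""
    scored = [
        (SequenceMatcher(None, phrase.lower(), entry.lower()).ratio(), entry)
        for entry in dictionary
    ]
    # Highest similarity first; ties keep dictionary order (sort is stable).
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def char_features(text, dim=16):
    """Hypothetical featurizer (an assumption): hashed character-bigram counts."""
    text = text.lower()
    vec = [0.0] * dim
    for a, b in zip(text, text[1:]):
        vec[(ord(a) * 31 + ord(b)) % dim] += 1.0
    return vec


def max_pool(vectors):
    """Element-wise maximum over a list of equal-length vectors."""
    return [max(column) for column in zip(*vectors)]


def pooled_match_features(phrase, dictionary, k=3, dim=16):
    """Weight each match's features by its similarity score, then max-pool."""
    matches = top_k_matches(phrase, dictionary, k)
    vectors = [
        [score * value for value in char_features(entry, dim)]
        for score, entry in matches
    ]
    return max_pool(vectors)


if __name__ == "__main__":
    gene_dictionary = ["BRCA1", "BRCA2", "TP53", "breast cancer 1", "EGFR"]
    print(top_k_matches("brca1", gene_dictionary))
    print(pooled_match_features("brca1", gene_dictionary))
```

In a full system the pooled vector would presumably be fed to a learned classifier that scores entity-likeness; that part is omitted here because the abstract gives no details about it.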

References used

https://aclanthology.org/

Towards Realistic Single-Task Continuous Learning Research for NER

Association for Computational Linguistics, 2021 (article)

There is an increasing interest in continuous learning (CL), as data privacy is becoming a priority for real-world machine learning applications. Meanwhile, there is still a lack of academic NLP benchmarks that are applicable to realistic CL settings, which is a major challenge for the advancement of the field. In this paper we discuss some of the unrealistic data characteristics of public datasets, and study the challenges of realistic single-task continuous learning as well as the effectiveness of data rehearsal as a way to mitigate accuracy loss. We construct a CL NER dataset from an existing publicly available dataset and release it along with the code to the research community.
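
Data rehearsal, as mentioned in the abstract above, generally means replaying some earlier training examples while training on new data. The sketch below shows one simple way this could look. The abstract does not specify the rehearsal strategy, so the function name `mix_with_rehearsal`, the `replay_fraction` parameter, and the uniform sampling are all assumptions made for illustration.

```python
import random


def mix_with_rehearsal(new_examples, replay_buffer, replay_fraction=0.25, seed=0):
    """Combine new examples with a random sample of earlier ones.

    replay_fraction is the number of replayed examples relative to the
    number of new examples (an assumed knob, not taken from the paper).
    """
    rng = random.Random(seed)
    n_replay = min(len(replay_buffer), int(len(new_examples) * replay_fraction))
    replayed = rng.sample(replay_buffer, n_replay)
    mixed = list(new_examples) + replayed
    rng.shuffle(mixed)  # avoid presenting all replayed data at the end
    return mixed


if __name__ == "__main__":
    old = [f"old{i}" for i in range(10)]
    new = [f"new{i}" for i in range(8)]
    print(mix_with_rehearsal(new, old))
```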

Keywords: single-task continuous learning, realistic single-task continuous, continuous learning

Noisy-Labeled NER with Confidence Estimation

Association for Computational Linguistics, 2021 (article)

Recent studies in deep learning have shown significant progress in named entity recognition (NER). However, most existing works assume clean data annotation, while real-world scenarios typically involve a large amount of noise from a variety of sources (e.g., pseudo, weak, or distant annotations). This work studies NER under a noisy labeled setting with calibrated confidence estimation. Based on empirical observations of different training dynamics of noisy and clean labels, we propose strategies for estimating confidence scores based on local and global independence assumptions. We partially marginalize out labels of low confidence with a CRF model. We further propose a calibration method for confidence scores based on the structure of entity labels. We integrate our approach into a self-training framework for boosting performance. Experiments in general noisy settings with four languages and distantly labeled settings demonstrate the effectiveness of our method.

Keywords: noisy-labeled NER, confidence estimation, NER

Learning with Different Amounts of Annotation: From Zero to Many Labels

Association for Computational Linguistics, 2021 (article)

Training NLP systems typically assumes access to annotated data that has a single human label per example. Given imperfect labeling from annotators and the inherent ambiguity of language, we hypothesize that a single label is not sufficient to learn the spectrum of language interpretation. We explore new annotation distribution schemes, assigning multiple labels per example for a small subset of training examples. Introducing such multi-label examples at the cost of annotating fewer examples brings clear gains on natural language inference and entity typing tasks, even when we simply first train with single-label data and then fine-tune with multi-label examples. Extending a MixUp data augmentation framework, we propose a learning algorithm that can learn from training examples with different amounts of annotation (with zero, one, or multiple labels). This algorithm efficiently combines signals from uneven training data and brings additional gains in low annotation budget and cross-domain settings. Together, our method achieves consistent gains in two tasks, suggesting that distributing labels unevenly among training examples can be beneficial for many NLP tasks.
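
For readers unfamiliar with MixUp, the sketch below shows the standard interpolation step applied to examples whose targets are label distributions built from one or more annotations. It is not the paper's extended algorithm, which the abstract does not detail. The helper names are illustrative, and the uniform distribution used for zero-label examples is purely a placeholder assumption.

```python
import random
from collections import Counter


def label_distribution(labels, classes):
    """Turn a list of annotator labels into a probability vector over classes.

    With no labels, fall back to a uniform distribution (placeholder
    assumption; the paper's handling of zero-label examples may differ).
    """
    if not labels:
        return [1.0 / len(classes)] * len(classes)
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]


def sample_lambda(alpha=0.4, rng=random):
    """Draw the mixing weight from Beta(alpha, alpha), as in standard MixUp."""
    return rng.betavariate(alpha, alpha)


def mixup(x1, y1, x2, y2, lam):
    """Linearly interpolate two feature vectors and their label distributions."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


if __name__ == "__main__":
    classes = ["entailment", "neutral", "contradiction"]
    y_multi = label_distribution(["entailment", "entailment", "neutral"], classes)
    y_single = label_distribution(["contradiction"], classes)
    lam = sample_lambda(rng=random.Random(0))
    print(mixup([1.0, 0.0], y_multi, [0.0, 1.0], y_single, lam))
```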

Keywords: robust inference evaluation, single label, labels, fill in the appropriate word

WikiGUM: Exhaustive Entity Linking for Wikification in 12 Genres

Association for Computational Linguistics, 2021 (article)

Previous work on Entity Linking has focused on resources targeting non-nested proper named entity mentions, often in data from Wikipedia, i.e. Wikification. In this paper, we present and evaluate WikiGUM, a fully wikified dataset, covering all mentions of named entities, including their non-named and pronominal mentions, as well as mentions nested within other mentions. The dataset covers a broad range of 12 written and spoken genres, most of which have not been included in Entity Linking efforts to date, leading to poor performance by a pretrained SOTA system in our evaluation. The availability of a variety of other annotations for the same data also enables further research on entities in context.

Keywords: exhaustive entity linking, exhaustive entity

Synchronous Dual Network with Cross-Type Attention for Joint Entity and Relation Extraction

Association for Computational Linguistics, 2021 (article)

Joint entity and relation extraction is challenging due to the complex interaction between named entity recognition and relation extraction. Although most existing works tend to jointly train these two tasks through a shared network, they fail to fully utilize the interdependence between entity types and relation types. In this paper, we design a novel synchronous dual network (SDN) with cross-type attention by separately and interactively considering the entity types and relation types. On the one hand, SDN adopts two isomorphic bi-directional type-attention LSTMs to encode the entity type enhanced representations and the relation type enhanced representations, respectively. On the other hand, SDN explicitly models the interdependence between entity types and relation types via a cross-type attention mechanism. In addition, we also propose a new multi-task learning strategy that models the interaction of the two types of information. Experiments on the NYT and WebNLG datasets verify the effectiveness of the proposed model, achieving state-of-the-art performance.

Keywords: enrichment, low classification, synchronous dual network
