
Improving Distantly-Supervised Named Entity Recognition with Self-Collaborative Denoising Learning


Publication date: 2021
Language: English





Distantly supervised named entity recognition (DS-NER) efficiently reduces labor costs but intrinsically suffers from label noise due to the strong assumption of distant supervision. Typically, the wrongly labeled instances comprise both incomplete and inaccurate annotations, while most prior denoising works are concerned with only one kind of noise and fail to fully explore useful information in the training set. To address this issue, we propose a robust learning paradigm named Self-Collaborative Denoising Learning (SCDL), which jointly trains two teacher-student networks in a mutually beneficial manner to iteratively perform noisy label refinery. Each network is designed to exploit reliable labels via self-denoising, and the two networks communicate with each other to explore unreliable annotations by collaborative denoising. Extensive experimental results on five real-world datasets demonstrate that SCDL is superior to state-of-the-art DS-NER denoising methods.
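
To make the idea concrete, here is a minimal, hypothetical sketch of such a two-network teacher-student denoising loop in PyTorch. The toy tagger, confidence threshold, EMA decay, and label-refinement rule are assumptions for illustration, not the paper's actual implementation.

```python
# Illustrative sketch of a two-network teacher-student denoising loop
# (hypothetical: the toy tagger, threshold, and EMA decay are assumptions).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class Tagger(nn.Module):
    """Toy token classifier standing in for a real NER encoder."""
    def __init__(self, vocab=5000, dim=64, n_tags=9):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * dim, n_tags)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)  # (batch, seq, n_tags)

def ema_update(teacher, student, decay=0.99):
    # Teacher weights track the student as an exponential moving average.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.data.mul_(decay).add_(s.data, alpha=1 - decay)

students = [Tagger(), Tagger()]
teachers = [copy.deepcopy(m) for m in students]
opts = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in students]
THRESH = 0.7  # assumed confidence threshold for "reliable" tokens

def refine_labels(i, tokens, distant_labels):
    """Self-denoising: trust network i's teacher where it is confident and
    agrees with the distant label; otherwise take the peer teacher's
    prediction (collaborative denoising)."""
    with torch.no_grad():
        own = F.softmax(teachers[i](tokens), dim=-1)
        peer = F.softmax(teachers[1 - i](tokens), dim=-1)
    own_conf, own_pred = own.max(dim=-1)
    reliable = (own_conf > THRESH) & (own_pred == distant_labels)
    return torch.where(reliable, distant_labels, peer.argmax(dim=-1))

def train_step(tokens, distant_labels):
    for i, (student, opt) in enumerate(zip(students, opts)):
        labels = refine_labels(i, tokens, distant_labels)
        loss = F.cross_entropy(student(tokens).transpose(1, 2), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
        ema_update(teachers[i], student)

# Toy usage with random data.
train_step(torch.randint(0, 5000, (2, 12)), torch.randint(0, 9, (2, 12)))
```
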



References used
https://aclanthology.org/

Related research

Distantly supervised models are very popular for relation extraction since we can obtain a large amount of training data using the distant supervision method without human annotation. In distant supervision, a sentence is considered as a source of a tuple if the sentence contains both entities of the tuple. However, this condition is too permissive and does not guarantee the presence of relevant relation-specific information in the sentence. As such, distantly supervised training data contains much noise which adversely affects the performance of the models. In this paper, we propose a self-ensemble filtering mechanism to filter out the noisy samples during the training process. We evaluate our proposed framework on the New York Times dataset which is obtained via distant supervision. Our experiments with multiple state-of-the-art neural relation extraction models show that our proposed filtering mechanism improves the robustness of the models and increases their F1 scores.
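
As a rough illustration of the filtering idea (not the authors' exact mechanism), the sketch below keeps a training sample only when enough model snapshots agree with its distant label; the agreement threshold and snapshot count are assumed.

```python
# Hypothetical sketch of self-ensemble filtering: a training sample survives
# only if enough past model snapshots agree with its distant label
# (the 50% threshold and 3 snapshots are assumptions).
import torch

def filter_noisy(snapshots, features, distant_labels, min_agree=0.5):
    votes = torch.zeros(len(distant_labels))
    with torch.no_grad():
        for snap in snapshots:
            preds = snap(features).argmax(dim=-1)
            votes += (preds == distant_labels).float()
    keep = votes / len(snapshots) >= min_agree
    return features[keep], distant_labels[keep]

# Toy usage: three snapshots of a linear relation classifier over 16-d features.
snaps = [torch.nn.Linear(16, 4) for _ in range(3)]
feats, labels = torch.randn(8, 16), torch.randint(0, 4, (8,))
clean_feats, clean_labels = filter_noisy(snaps, feats, labels)
```
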
We explore the application of state-of-the-art NER algorithms to ASR-generated call center transcripts. Previous work in this domain focused on the use of a BiLSTM-CRF model which relied on Flair embeddings; however, such a model is unwieldy in terms of latency and memory consumption. In a production environment, end users require low-latency models which can be readily integrated into existing pipelines. To that end, we present two different models which can be utilized based on the latency and accuracy requirements of the user. First, we propose a set of models which utilize state-of-the-art Transformer language models (RoBERTa) to develop a high-accuracy NER system trained on a custom annotated set of call center transcripts. We then use our best-performing Transformer-based model to label a large number of transcripts, which we use to pretrain a BiLSTM-CRF model and further fine-tune on our annotated dataset. We show that this model, while not as accurate as its Transformer-based counterpart, is highly effective in identifying items which require redaction for privacy law compliance. Further, we propose a new general annotation scheme for NER in the call-center environment.
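
A hedged sketch of the pseudo-labeling step described above, assuming a Hugging Face token-classification pipeline; the checkpoint name is a placeholder, not a released model.

```python
# Hedged sketch of the pseudo-labeling step: a fine-tuned transformer tags
# unlabeled transcripts, and the resulting "silver" data pretrains a lighter
# BiLSTM-CRF. The checkpoint path below is a placeholder, not a real model.
from transformers import pipeline

teacher = pipeline(
    "token-classification",
    model="my-org/roberta-callcenter-ner",  # placeholder checkpoint
    aggregation_strategy="simple",
)

def pseudo_label(transcripts):
    """Return (text, [(start, end, type), ...]) pairs for student pretraining."""
    silver = []
    for text in transcripts:
        ents = teacher(text)
        silver.append((text, [(e["start"], e["end"], e["entity_group"]) for e in ents]))
    return silver
```
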
While named entity recognition (NER) from speech has been around as long as NER from written text has, the accuracy of NER from speech has generally been much lower than that of NER from text. The rise in popularity of spoken dialog systems such as Siri or Alexa highlights the need for more accurate NER from speech because NER is a core component for understanding what users said in dialogs. Deployed spoken dialog systems receive user input in the form of automatic speech recognition (ASR) transcripts, and simply applying an NER model trained on written text to ASR transcripts often leads to low accuracy because, compared to written text, ASR transcripts lack important cues such as punctuation and capitalization. Besides, errors in ASR transcripts also make NER from speech challenging. We propose two models that exploit dialog context and speech pattern clues to extract named entities more accurately from open-domain dialogs in spoken dialog systems. Our results show the benefit of modeling dialog context and speech patterns in two settings: a standard setting with random partition of data and a more realistic but also more difficult setting where many named entities encountered during deployment are unseen during training.
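
One common way to incorporate dialog context, sketched here purely as an assumption about the general idea rather than the authors' models, is to prepend a few previous turns to the current utterance before tagging it.

```python
# Minimal sketch of one assumed way to model dialog context: prepend a few
# previous turns (separated by a marker) before tagging the current utterance.
def build_context_input(turns, current_idx, max_context=2, sep="[SEP]"):
    context = turns[max(0, current_idx - max_context):current_idx]
    return f" {sep} ".join(context + [turns[current_idx]])

turns = ["play some jazz", "which artist", "something by miles davis"]
print(build_context_input(turns, 2))
# -> play some jazz [SEP] which artist [SEP] something by miles davis
```
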
Scientific literature analysis needs fine-grained named entity recognition (NER) to provide a wide range of information for scientific discovery. For example, chemistry research needs to study dozens to hundreds of distinct, fine-grained entity types, making consistent and accurate annotation difficult even for crowds of domain experts. On the other hand, domain-specific ontologies and knowledge bases (KBs) can be easily accessed, constructed, or integrated, which makes distant supervision realistic for fine-grained chemistry NER. In distant supervision, training labels are generated by matching mentions in a document with the concepts in the KBs. However, this kind of KB-matching suffers from two major challenges: incomplete annotation and noisy annotation. We propose ChemNER, an ontology-guided, distantly-supervised method for fine-grained chemistry NER to tackle these challenges. It leverages the chemistry type ontology structure to generate distant labels with novel methods of flexible KB-matching and ontology-guided multi-type disambiguation. It significantly improves the distant label generation for the subsequent sequence labeling model training. We also provide an expert-labeled, chemistry NER dataset with 62 fine-grained chemistry types (e.g., chemical compounds and chemical reactions). Experimental results show that ChemNER is highly effective, substantially outperforming the state-of-the-art NER methods (with .25 absolute F1 score improvement).
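
The following is a deliberately simplified illustration of KB-matching for distant label generation, not ChemNER's flexible matching or disambiguation algorithm; the dictionary, ontology depths, and tie-breaking rule are invented for the example.

```python
# Deliberately simplified KB-matching for distant NER labels (not ChemNER's
# actual algorithm): dictionary lookup, then a naive "deepest ontology type
# wins" disambiguation. Dictionary and depths are invented for the example.
KB = {"benzene": {"ChemicalCompound"}, "oxidation": {"ChemicalReaction", "Process"}}
ONTOLOGY_DEPTH = {"Process": 1, "ChemicalReaction": 2, "ChemicalCompound": 2}

def distant_labels(tokens):
    labels = []
    for tok in tokens:
        types = KB.get(tok.lower(), set())
        labels.append(max(types, key=lambda t: ONTOLOGY_DEPTH.get(t, 0)) if types else "O")
    return labels

print(distant_labels("Benzene undergoes oxidation".split()))
# -> ['ChemicalCompound', 'O', 'ChemicalReaction']
```
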
Nowadays, named entity recognition (NER) achieves excellent results on standard corpora. However, significant issues arise when it must be applied in a specific domain, because this requires a suitable annotated corpus with an adapted NE tag-set. This is particularly evident in the historical document processing field. The main goal of this paper is to propose and evaluate several transfer learning methods to increase the score of Czech historical NER. We study several information sources, and we use two neural networks for NE modeling and recognition. We employ two corpora for the evaluation of our transfer learning methods, namely the Czech named entity corpus and the Czech historical named entity corpus. We show that a fine-tuned BERT representation with only a simple classifier, trained on the union of the corpora, achieves excellent results.
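
A minimal sketch of the "BERT with a simple token classifier, fine-tuned on the union of corpora" setup; the multilingual checkpoint, tag set, and toy sentences below are placeholders, not the paper's actual model or data.

```python
# Minimal sketch of "BERT + simple token classifier fine-tuned on the union of
# corpora". The multilingual checkpoint, tag set, and two toy sentences are
# placeholders, not the paper's actual model or data.
from transformers import AutoModelForTokenClassification, AutoTokenizer

labels = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]  # assumed tag set
tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(labels))

# The key idea is simply to concatenate the modern and historical corpora.
modern_corpus = [(["Praha", "je", "hlavní", "město"], ["B-LOC", "O", "O", "O"])]
historical_corpus = [(["Království", "české"], ["B-LOC", "I-LOC"])]
union_corpus = modern_corpus + historical_corpus  # fine-tune one model on both

enc = tok(union_corpus[0][0], is_split_into_words=True, return_tensors="pt")
logits = model(**enc).logits  # (1, num_subwords, len(labels))
```
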
