Research papers, master's and doctoral theses about NER

DreamDrug - A crowdsourced NER dataset for detecting drugs in darknet markets

619 - Association for Computational Linguistics 2021, article

We present DreamDrug, a crowdsourced dataset for detecting mentions of drugs in noisy user-generated item listings from darknet markets. Our dataset contains nearly 15,000 manually annotated drug entities in over 3,500 item listings scraped from the darknet market platform "DreamMarket" in 2017. We also train and evaluate baseline models for detecting these entities, using contextual language models fine-tuned in a few-shot setting and on the full dataset, and examine the effect of pretraining on in-domain unannotated corpora.

crowdsourced NER dataset, crowdsourced NER, NER dataset

Can images help recognize entities? A study of the role of images for Multimodal NER

672 - Association for Computational Linguistics 2021, article

Multimodal named entity recognition (MNER) requires bridging the gap between language understanding and visual context. While many multimodal neural techniques have been proposed to incorporate images into the MNER task, the model's ability to leverage multimodal interactions remains poorly understood. In this work, we conduct in-depth analyses of existing multimodal fusion techniques from different perspectives and describe the scenarios where adding information from the image does not always boost performance. We also study the use of captions as a way to enrich the context for MNER. Experiments on three datasets from popular social platforms expose the bottleneck of existing multimodal models and the situations where using captions is beneficial.

recognize entities, multimodal, multimodal NER

ComboNER: A Lightweight All-In-One POS Tagger, Dependency Parser and NER

674 - Association for Computational Linguistics 2021, article

Current natural language processing research is strongly focused on raising accuracy. This progress comes at the cost of super-heavy models with hundreds of millions or even billions of parameters. However, simple syntactic tasks such as part-of-speech (POS) tagging, dependency parsing or named entity recognition (NER) do not require the largest models to achieve acceptable results. In line with this assumption, we try to minimize the size of a model that jointly performs all three tasks. We introduce ComboNER: a lightweight tool, orders of magnitude smaller than state-of-the-art transformers. It is based on pre-trained subword embeddings and a recurrent neural network architecture. ComboNER operates on Polish language data. The model has outputs for POS tagging, dependency parsing and NER. Our paper contains some insights from fine-tuning the model and reports its overall results.

POS tagger, dependency parser, parser and NER

Mitigating Temporal-Drift: A Simple Approach to Keep NER Models Crisp

927 - Association for Computational Linguistics 2021, article

The performance of neural models for named entity recognition degrades over time as the models become stale. This degradation is due to temporal drift, the change in our target variables' statistical properties over time. This issue is especially problematic for social media data, where topics change rapidly. In order to mitigate the problem, data annotation and retraining of models are common. Despite its usefulness, this process is expensive and time-consuming, which motivates new research on efficient model updating. In this paper, we propose an intuitive approach to measure the potential trendiness of tweets and use this metric to select the most informative instances to use for training. We conduct experiments on three state-of-the-art models on the Temporal Twitter Dataset. Our approach shows larger increases in prediction accuracy with less training data than the alternatives, making it an attractive, practical solution.

NER models crisp, mitigating temporal-drift, models crisp

Noisy-Labeled NER with Confidence Estimation

831 - Association for Computational Linguistics 2021, article

Recent studies in deep learning have shown significant progress in named entity recognition (NER). However, most existing works assume clean data annotation, while real-world scenarios typically involve a large amount of noise from a variety of sources (e.g., pseudo, weak, or distant annotations). This work studies NER under a noisy labeled setting with calibrated confidence estimation. Based on empirical observations of different training dynamics of noisy and clean labels, we propose strategies for estimating confidence scores based on local and global independence assumptions. We partially marginalize out labels of low confidence with a CRF model. We further propose a calibration method for confidence scores based on the structure of entity labels. We integrate our approach into a self-training framework for boosting performance. Experiments in general noisy settings with four languages and distantly labeled settings demonstrate the effectiveness of our method.
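The idea of partially marginalizing low-confidence labels with a CRF can be illustrated with a small sketch. This is not the authors' implementation: the function names, the boolean `confident` flags (standing in for thresholded confidence scores) and the toy score tables are all assumptions made for illustration. The sketch uses the standard forward algorithm of a linear-chain CRF and restricts it to the label paths that agree with the confident labels, leaving the other positions free.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def constrained_log_score(emissions, transitions, allowed):
    """Log of the summed score of every label path that respects `allowed`.

    emissions[t][y]     : score of label y at position t
    transitions[p][y]   : score of moving from label p to label y
    allowed[t]          : set of labels permitted at position t
    """
    n_labels = len(transitions)
    neg_inf = float("-inf")
    alpha = [emissions[0][y] if y in allowed[0] else neg_inf
             for y in range(n_labels)]
    for t in range(1, len(emissions)):
        alpha = [
            emissions[t][y]
            + logsumexp([alpha[p] + transitions[p][y] for p in range(n_labels)])
            if y in allowed[t] else neg_inf
            for y in range(n_labels)
        ]
    return logsumexp(alpha)

def partial_marginal_nll(emissions, transitions, labels, confident):
    """Negative log-likelihood where low-confidence labels are summed out.

    `confident[t]` is a hypothetical flag: True keeps labels[t] fixed,
    False lets position t take any label.
    """
    everything = set(range(len(transitions)))
    allowed = [{y} if keep else everything
               for y, keep in zip(labels, confident)]
    free = [everything] * len(emissions)
    log_z = constrained_log_score(emissions, transitions, free)
    return log_z - constrained_log_score(emissions, transitions, allowed)

# Toy example with two labels and three tokens (made-up scores).
emissions = [[1.0, 0.0], [0.5, 0.2], [0.0, 1.0]]
transitions = [[0.1, -0.1], [0.0, 0.3]]
labels = [0, 0, 1]
full_nll = partial_marginal_nll(emissions, transitions, labels, [True, True, True])
partial_nll = partial_marginal_nll(emissions, transitions, labels, [True, False, True])
```

When every label is marked confident, this reduces to the usual CRF negative log-likelihood; when none is, the loss is zero because all paths are accepted. Marking the middle token as unreliable gives a loss between the two.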

noisy-labeled NER, confidence estimation, NER

Knowledge Distillation for Swedish NER models: A Search for Performance and Efficiency

1060 - Association for Computational Linguistics 2021, article

The current recipe for better model performance within NLP is to increase model size and training data. While this gives us models with increasingly impressive results, it also makes it more difficult to train and deploy state-of-the-art models for NLP due to increasing computational costs. Model compression is a field of research that aims to alleviate this problem. The field encompasses different methods that aim to preserve the performance of a model while decreasing its size. One such method is knowledge distillation. In this article, we investigate the effect of knowledge distillation for named entity recognition models in Swedish. We show that while some sequence tagging models benefit from knowledge distillation, not all models do. This prompts us to ask in which situations and for which models knowledge distillation is beneficial. We also reason about the effect of knowledge distillation on computational costs.
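The abstract does not state which distillation objective the authors use. As a rough illustration of knowledge distillation in general, the sketch below shows one widely used formulation for a single token: a temperature-softened KL term between teacher and student label distributions, mixed with the ordinary cross-entropy against the gold label. The function names and the `temperature` and `alpha` defaults are assumptions chosen for the example, not values from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by `temperature`."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, gold_index,
                      temperature=2.0, alpha=0.5):
    """Per-token loss: alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE.

    The T^2 factor is the usual rescaling that keeps the soft-target term
    comparable in magnitude to the hard-label term.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q)
             for p, q in zip(p_teacher, p_student) if p > 0)
    # Hard-label cross-entropy uses the unsoftened student distribution.
    ce = -math.log(softmax(student_logits)[gold_index])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce

# Toy logits over three tags for one token (made-up numbers).
loss = distillation_loss([0.2, 1.5, -0.3], [0.0, 3.0, -1.0], gold_index=1)
```

In a sequence tagger, this per-token loss would typically be averaged over all tokens in a batch; a small student is trained on it while the teacher's weights stay frozen.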

Swedish NER models, Swedish NER
