Named Entity Recognition for Entity Linking: What Works and What's Next

Added by Association for Computational Linguistics, article

Publication date 2021

Fields Artificial Intelligence

Research language English

Created by Shamra Editor

detects boundaries

Abstract

Entity Linking (EL) systems have achieved impressive results on standard benchmarks mainly thanks to the contextualized representations provided by recent pretrained language models. However, such systems still require massive amounts of data -- millions of labeled examples -- to perform at their best, with training times that often exceed several days, especially when limited computational resources are available. In this paper, we look at how Named Entity Recognition (NER) can be exploited to narrow the gap between EL systems trained on high and low amounts of labeled data. More specifically, we show how and to what extent an EL system can benefit from NER to enhance its entity representations, improve candidate selection, select more effective negative samples and enforce hard and soft constraints on its output entities. We release our software -- code and model checkpoints -- at https://github.com/Babelscape/ner4el.

References used

https://aclanthology.org/

An Exploratory Study on Long Dialogue Summarization: What Works and What's Next

739 - Association for Computational Linguistics 2021 article

Dialogue summarization helps readers capture salient information from long conversations in meetings, interviews, and TV series. However, real-world dialogues pose a great challenge to current summarization models, as the dialogue length typically exceeds the input limits imposed by recent transformer-based pre-trained models, and the interactive nature of dialogues makes relevant information more context-dependent and sparsely distributed than news articles. In this work, we perform a comprehensive study on long dialogue summarization by investigating three strategies to deal with the lengthy input problem and locate relevant information: (1) extended transformer models such as Longformer, (2) retrieve-then-summarize pipeline models with several dialogue utterance retrieval methods, and (3) hierarchical dialogue encoding models such as HMNet. Our experimental results on three long dialogue datasets (QMSum, MediaSum, SummScreen) show that the retrieve-then-summarize pipeline models yield the best performance. We also demonstrate that the summary quality can be further improved with a stronger retrieval model and pretraining on proper external summarization datasets.

exploratory study, long dialogue summarization, long dialogue

Partially Supervised Named Entity Recognition via the Expected Entity Ratio Loss

438 - Association for Computational Linguistics 2021 article

We study learning named entity recognizers in the presence of missing entity annotations. We approach this setting as tagging with latent variables and propose a novel loss, the Expected Entity Ratio, to learn models in the presence of systematically missing tags. We show that our approach is both theoretically sound and empirically useful. Experimentally, we find that it meets or exceeds performance of strong and state-of-the-art baselines across a variety of languages, annotation scenarios, and amounts of labeled data. In particular, we find that it significantly outperforms the previous state-of-the-art methods from Mayhew et al. (2019) and Li et al. (2021) by +12.7 and +2.3 F1 score in a challenging setting with only 1,000 biased annotations, averaged across 7 datasets. We also show that, when combined with our approach, a novel sparse annotation scheme outperforms exhaustive annotation for modest annotation budgets.

partially supervised, named, supervised named entity

MasakhaNER: Named Entity Recognition for African Languages

535 - Association for Computational Linguistics 2021 article

We take a step towards addressing the underrepresentation of the African continent in NLP research by bringing together different stakeholders to create the first large, publicly available, high-quality dataset for named entity recognition (NER) in ten African languages. We detail the characteristics of these languages to help researchers and practitioners better understand the challenges they pose for NER tasks. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. Finally, we release the data, code, and models to inspire future research on African NLP.

current English datasets

Transfer Learning for Czech Historical Named Entity Recognition

328 - Association for Computational Linguistics 2021 article

Named entity recognition (NER) now achieves excellent results on the standard corpora. However, major issues emerge when it is applied to a specific domain, because this requires a suitable annotated corpus with an adapted NE tag-set. This is particularly evident in the historical document processing field. The main goal of this paper is to propose and evaluate several transfer learning methods to increase the score of the Czech historical NER. We study several information sources, and we use two neural nets for NE modeling and recognition. We employ two corpora for evaluation of our transfer learning methods, namely the Czech named entity corpus and the Czech historical named entity corpus. We show that a BERT representation with fine-tuning and only a simple classifier trained on the union of the corpora achieves excellent results.

method, historical named entity

Data Augmentation for Cross-Domain Named Entity Recognition

440 - Association for Computational Linguistics 2021 article

Current work in named entity recognition (NER) shows that data augmentation techniques can produce more robust models. However, most existing techniques focus on augmenting in-domain data in low-resource scenarios where annotated data is quite limited. In this work, we take this research direction to the opposite and study cross-domain data augmentation for the NER task. We investigate the possibility of leveraging data from high-resource domains by projecting it into the low-resource domains. Specifically, we propose a novel neural architecture to transform the data representation from a high-resource to a low-resource domain by learning the patterns (e.g. style, noise, abbreviations, etc.) in the text that differentiate them and a shared feature space where both domains are aligned. We experiment with diverse datasets and show that transforming the data to the low-resource domain representation achieves significant improvements over only using data from high-resource domains.

real
