Do you want to publish a course? Click here

Zero-shot Entity Linking with Efficient Long Range Sequence Modeling

74   0   0.0 ( 0 )
 Added by Zonghai Yao
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

This paper considers the problem of zero-shot entity linking, in which a link in the test time may not present in training. Following the prevailing BERT-based research efforts, we find a simple yet effective way is to expand the long-range sequence modeling. Unlike many previous methods, our method does not require expensive pre-training of BERT with long position embedding. Instead, we propose an efficient position embeddings initialization method called Embedding-repeat, which initializes larger position embeddings based on BERT-Base. On Wikias zero-shot EL dataset, our method improves the SOTA from 76.06% to 79.08%, and for its long data, the corresponding improvement is from 74.57% to 82.14%. Our experiments suggest the effectiveness of long-range sequence modeling without retraining the BERT model.



rate research

Read More

We present the zero-shot entity linking task, where mentions must be linked to unseen entities without in-domain labeled data. The goal is to enable robust transfer to highly specialized domains, and so no metadata or alias tables are assumed. In this setting, entities are only identified by text descriptions, and models must rely strictly on language understanding to resolve the new entities. First, we show that strong reading comprehension models pre-trained on large unlabeled data can be used to generalize to unseen entities. Second, we propose a simple and effective adaptive pre-training strategy, which we term domain-adaptive pre-training (DAP), to address the domain shift problem associated with linking unseen entities in a new domain. We present experiments on a new dataset that we construct for this task and show that DAP improves over strong pre-training baselines, including BERT. The data and code are available at https://github.com/lajanugen/zeshel.
Entity linking -- the task of identifying references in free text to relevant knowledge base representations -- often focuses on single languages. We consider multilingual entity linking, where a single model is trained to link references to same-language knowledge bases in several languages. We propose a neural ranker architecture, which leverages multilingual transformer representations of text to be easily applied to a multilingual setting. We then explore how a neural ranker trained in one language (e.g. English) transfers to an unseen language (e.g. Chinese), and find that while there is a consistent but not large drop in performance. How can this drop in performance be alleviated? We explore adding an adversarial objective to force our model to learn language-invariant representations. We find that using this approach improves recall in several datasets, often matching the in-language performance, thus alleviating some of the performance loss occurring from zero-shot transfer.
Cross-language entity linking grounds mentions in multiple languages to a single-language knowledge base. We propose a neural ranking architecture for this task that uses multilingual BERT representations of the mention and the context in a neural network. We find that the multilingual ability of BERT leads to robust performance in monolingual and multilingual settings. Furthermore, we explore zero-shot language transfer and find surprisingly robust performance. We investigate the zero-shot degradation and find that it can be partially mitigated by a proposed auxiliary training objective, but that the remaining error can best be attributed to domain shift rather than language transfer.
We present ELQ, a fast end-to-end entity linking model for questions, which uses a biencoder to jointly perform mention detection and linking in one pass. Evaluated on WebQSP and GraphQuestions with extended annotations that cover multiple entities per question, ELQ outperforms the previous state of the art by a large margin of +12.7% and +19.6% F1, respectively. With a very fast inference time (1.57 examples/s on a single CPU), ELQ can be useful for downstream question answering systems. In a proof-of-concept experiment, we demonstrate that using ELQ significantly improves the downstream QA performance of GraphRetriever (arXiv:1911.03868). Code and data available at https://github.com/facebookresearch/BLINK/tree/master/elq
Existing state of the art neural entity linking models employ attention-based bag-of-words context model and pre-trained entity embeddings bootstrapped from word embeddings to assess topic level context compatibility. However, the latent entity type information in the immediate context of the mention is neglected, which causes the models often link mentions to incorrect entities with incorrect type. To tackle this problem, we propose to inject latent entity type information into the entity embeddings based on pre-trained BERT. In addition, we integrate a BERT-based entity similarity score into the local context model of a state-of-the-art model to better capture latent entity type information. Our model significantly outperforms the state-of-the-art entity linking models on standard benchmark (AIDA-CoNLL). Detailed experiment analysis demonstrates that our model corrects most of the type errors produced by the direct baseline.

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا