No Arabic abstract
We evaluate named entity representations of BERT-based NLP models by investigating their robustness to replacements from the same typed class in the input. We highlight that on several tasks while such perturbations are natural, state of the art trained models are surprisingly brittle. The brittleness continues even with the recent entity-aware BERT models. We also try to discern the cause of this non-robustness, considering factors such as tokenization and frequency of occurrence. Then we provide a simple method that ensembles predictions from multiple replacements while jointly modeling the uncertainty of type annotations and label predictions. Experiments on three NLP tasks show that our method enhances robustness and increases accuracy on both natural and adversarial datasets.
Users on Twitter are commonly identified by their profile names. These names are used when directly addressing users on Twitter, are part of their profile page URLs, and can become a trademark for popular accounts, with people referring to celebrities by their real name and their profile name, interchangeably. Twitter, however, has chosen to not permanently link profile names to their corresponding user accounts. In fact, Twitter allows users to change their profile name, and afterwards makes the old profile names available for other users to take. In this paper, we provide a large-scale study of the phenomenon of profile name reuse on Twitter. We show that this phenomenon is not uncommon, investigate the dynamics of profile name reuse, and characterize the accounts that are involved in it. We find that many of these accounts adopt abandoned profile names for questionable purposes, such as spreading malicious content, and using the profile names popularity for search engine optimization. Finally, we show that this problem is not unique to Twitter (as other popular online social networks also release profile names) and argue that the risks involved with profile-name reuse outnumber the advantages provided by this feature.
Taking word sequences as the input, typical named entity recognition (NER) models neglect errors from pre-processing (e.g., tokenization). However, these errors can influence the model performance greatly, especially for noisy texts like tweets. Here, we introduce Neural-Char-CRF, a raw-to-end framework that is more robust to pre-processing errors. It takes raw character sequences as inputs and makes end-to-end predictions. Word embedding and contextualized representation models are further tailored to capture textual signals for each character instead of each word. Our model neither requires the conversion from character sequences to word sequences, nor assumes tokenizer can correctly detect all word boundaries. Moreover, we observe our model performance remains unchanged after replacing tokenization with string matching, which demonstrates its potential to be tokenization-free. Extensive experimental results on two public datasets demonstrate the superiority of our proposed method over the state of the art. The implementations and datasets are made available at: https://github.com/LiyuanLucasLiu/Raw-to-End.
A flaw in QA evaluation is that annotations often only provide one gold answer. Thus, model predictions semantically equivalent to the answer but superficially different are considered incorrect. This work explores mining alias entities from knowledge bases and using them as additional gold answers (i.e., equivalent answers). We incorporate answers for two settings: evaluation with additional answers and model training with equivalent answers. We analyse three QA benchmarks: Natural Questions, TriviaQA, and SQuAD. Answer expansion increases the exact match score on all datasets for evaluation, while incorporating it helps model training over real-world datasets. We ensure the additional answers are valid through a human post hoc evaluation.
Active Galactic Nuclei (AGN) are energetic astrophysical sources powered by accretion onto supermassive black holes in galaxies, and present unique observational signatures that cover the full electromagnetic spectrum over more than twenty orders of magnitude in frequency. The rich phenomenology of AGN has resulted in a large number of different flavours in the literature that now comprise a complex and confusing AGN zoo. It is increasingly clear that these classifications are only partially related to intrinsic differences between AGN, and primarily reflect variations in a relatively small number of astrophysical parameters as well the method by which each class of AGN is selected. Taken together, observations in different electromagnetic bands as well as variations over time provide complementary windows on the physics of different sub-structures in the AGN. In this review, we present an overview of AGN multi-wavelength properties with the aim of painting their big picture through observations in each electromagnetic band from radio to gamma-rays as well as AGN variability. We address what we can learn from each observational method, the impact of selection effects, the physics behind the emission at each wavelength, and the potential for future studies. To conclude we use these observations to piece together the basic architecture of AGN, discuss our current understanding of unification models, and highlight some open questions that present opportunities for future observational and theoretical progress.
Many name tagging approaches use local contextual information with much success, but fail when the local context is ambiguous or limited. We present a new framework to improve name tagging by utilizing local, document-level, and corpus-level contextual information. We retrieve document-level context from other sentences within the same document and corpus-level context from sentences in other topically related documents. We propose a model that learns to incorporate document-level and corpus-level contextual information alongside local contextual information via global attentions, which dynamically weight their respective contextual information, and gating mechanisms, which determine the influence of this information. Extensive experiments on benchmark datasets show the effectiveness of our approach, which achieves state-of-the-art results for Dutch, German, and Spanish on the CoNLL-2002 and CoNLL-2003 datasets.