Will it Unblend?

315 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Yuval Pinter

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Yuval Pinter - Cassandra L. Jacobs - Jacob Eisenstein

الحساب واللغة الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Natural language processing systems often struggle with out-of-vocabulary (OOV) terms, which do not appear in training data. Blends, such as innoventor, are one particularly challenging class of OOV, as they are formed by fusing together two or more bases that relate to the intended meaning in unpredictable manners and degrees. In this work, we run experiments on a novel dataset of English OOV blends to quantify the difficulty of interpreting the meanings of blends by large-scale contextual language models such as BERT. We first show that BERTs processing of these blends does not fully access the component meanings, leaving their contextual representations semantically impoverished. We find this is mostly due to the loss of characters resulting from blend formation. Then, we assess how easily different models can recognize the structure and recover the origin of blends, and find that context-aware embedding systems outperform character-level and context-free embeddings, although their results are still far from satisfactory.

قيم البحث

236 - Dzmitry Bahdanau , Shikhar Murty , Michael Noukhovitch 2018

Numerous models for grounded language understanding have been recently proposed, including (i) generic models that can be easily adapted to any given task and (ii) intuitively appealing modular models that require background knowledge to be instantia ted. We compare both types of models in how much they lend themselves to a particular form of systematic generalization. Using a synthetic VQA test, we evaluate which models are capable of reasoning about all possible object pairs after training on only a small subset of them. Our findings show that the generalization of modular models is much more systematic and that it is highly sensitive to the module layout, i.e. to how exactly the modules are connected. We furthermore investigate if modular models that generalize well could be made more end-to-end by learning their layout and parametrization. We find that end-to-end methods from prior work often learn inappropriate layouts or parametrizations that do not facilitate systematic generalization. Our results suggest that, in addition to modularity, systematic generalization in language understanding may require explicit regularizers or priors.

الحساب واللغة الذكاء الاصطناعي

Dont Take It Literally: An Edit-Invariant Sequence Loss for Text Generation

82 - Guangyi Liu , Zichao Yang , Tianhua Tao 2021

Neural text generation models are typically trained by maximizing log-likelihood with the sequence cross entropy loss, which encourages an exact token-by-token match between a target sequence with a generated sequence. Such training objective is sub- optimal when the target sequence not perfect, e.g., when the target sequence is corrupted with noises, or when only weak sequence supervision is available. To address this challenge, we propose a novel Edit-Invariant Sequence Loss (EISL), which computes the matching loss of a target n-gram with all n-grams in the generated sequence. EISL draws inspirations from convolutional networks (ConvNets) which are shift-invariant to images, hence is robust to the shift of n-grams to tolerate edits in the target sequences. Moreover, the computation of EISL is essentially a convolution operation with target n-grams as kernels, which is easy to implement with existing libraries. To demonstrate the effectiveness of EISL, we conduct experiments on three tasks: machine translation with noisy target sequences, unsupervised text style transfer, and non-autoregressive machine translation. Experimental results show our method significantly outperforms cross entropy loss on these three tasks.

الحساب واللغة الذكاء الاصطناعي

The promise of Gaia and how it will influence stellar ages

363 - Carla Cacciari 2008

The Gaia space project, planned for launch in 2011, is one of the ESA cornerstone missions, and will provide astrometric, photometric and spectroscopic data of very high quality for about one billion stars brighter than V=20. This will allow to reach an unprecedented level of information and knowledge on several of the most fundamental astrophysical issues, such as mapping of the Milky Way, stellar physics (classification and parameterization), Galactic kinematics and dynamics, study of the resolved stellar populations in the Local Group, distance scale and age of the Universe, dark matter distribution (potential tracers), reference frame (quasars, astrometry), planet detection, fundamental physics, Solar physics, Solar system science. I will present a description of the instrument and its main characteristics, and discuss a few specific science cases where Gaia data promise to contribute fundamental improvement within the scope of this Symposium.

Dont Read Too Much into It: Adaptive Computation for Open-Domain Question Answering

109 - Yuxiang Wu , Sebastian Riedel , Pasquale Minervini 2020

Most approaches to Open-Domain Question Answering consist of a light-weight retriever that selects a set of candidate passages, and a computationally expensive reader that examines the passages to identify the correct answer. Previous works have show n that as the number of retrieved passages increases, so does the performance of the reader. However, they assume all retrieved passages are of equal importance and allocate the same amount of computation to them, leading to a substantial increase in computational cost. To reduce this cost, we propose the use of adaptive computation to control the computational budget allocated for the passages to be read. We first introduce a technique operating on individual passages in isolation which relies on anytime prediction and a per-layer estimation of an early exit probability. We then introduce SkylineBuilder, an approach for dynamically deciding on which passage to allocate computation at each step, based on a resource allocation policy trained via reinforcement learning. Our results on SQuAD-Open show that adaptive computation with global prioritisation improves over several strong static and adaptive methods, leading to a 4.3x reduction in computation while retaining 95% performance of the full model.

الحساب واللغة الذكاء الاصطناعي التعلم الآلي

Build it Break it Fix it for Dialogue Safety: Robustness from Adversarial Human Attack

320 - Emily Dinan , Samuel Humeau , Bharath Chintagunta 2019

The detection of offensive language in the context of a dialogue has become an increasingly important application of natural language processing. The detection of trolls in public forums (Galan-Garcia et al., 2016), and the deployment of chatbots in the public domain (Wolf et al., 2017) are two examples that show the necessity of guarding against adversarially offensive behavior on the part of humans. In this work, we develop a training scheme for a model to become robust to such human attacks by an iterative build it, break it, fix it strategy with humans and models in the loop. In detailed experiments we show this approach is considerably more robust than previous systems. Further, we show that offensive language used within a conversation critically depends on the dialogue context, and cannot be viewed as a single sentence offensive detection task as in most previous work. Our newly collected tasks and methods will be made open source and publicly available.

الحساب واللغة