Explaining Answers with Entailment Trees

101 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Bhavana Dalvi Mishra

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Bhavana Dalvi - Peter Jansen - Oyvind Tafjord

الحساب واللغة الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Our goal, in the context of open-domain textual question-answering (QA), is to explain answers by not just listing supporting textual evidence (rationales), but also showing how such evidence leads to the answer in a systematic way. If this could be done, new opportunities for understanding and debugging the systems reasoning would become possible. Our approach is to generate explanations in the form of entailment trees, namely a tree of entailment steps from facts that are known, through intermediate conclusions, to the final answer. To train a model with this skill, we created ENTAILMENTBANK, the first dataset to contain multistep entailment trees. At each node in the tree (typically) two or more facts compose together to produce a new conclusion. Given a hypothesis (question + answer), we define three increasingly difficult explanation tasks: generate a valid entailment tree given (a) all relevant sentences (the leaves of the gold entailment tree), (b) all relevant and some irrelevant sentences, or (c) a corpus. We show that a strong language model only partially solves these tasks, and identify several new directions to improve performance. This work is significant as it provides a new type of dataset (multistep entailments) and baselines, offering a new avenue for the community to generate richer, more systematic explanations.

قيم البحث

207 - Tuhin Chakrabarty , Christopher Hidey , Smaranda Muresan 2021

Framing involves the positive or negative presentation of an argument or issue depending on the audience and goal of the speaker (Entman 1983). Differences in lexical framing, the focus of our work, can have large effects on peoples opinions and beli efs. To make progress towards reframing arguments for positive effects, we create a dataset and method for this task. We use a lexical resource for connotations to create a parallel corpus and propose a method for argument reframing that combines controllable text generation (positive connotation) with a post-decoding entailment component (same denotation). Our results show that our method is effective compared to strong baselines along the dimensions of fluency, meaning, and trustworthiness/reduction of fear.

الحساب واللغة الذكاء الاصطناعي

Entailment as Few-Shot Learner

422 - Sinong Wang , Han Fang , Madian Khabsa 2021

Large pre-trained language models (LMs) have demonstrated remarkable ability as few-shot learners. However, their success hinges largely on scaling model parameters to a degree that makes it challenging to train and serve. In this paper, we propose a new approach, named as EFL, that can turn small LMs into better few-shot learners. The key idea of this approach is to reformulate potential NLP task into an entailment one, and then fine-tune the model with as little as 8 examples. We further demonstrate our proposed method can be: (i) naturally combined with an unsupervised contrastive learning-based data augmentation method; (ii) easily extended to multilingual few-shot learning. A systematic evaluation on 18 standard NLP tasks demonstrates that this approach improves the various existing SOTA few-shot learning methods by 12%, and yields competitive few-shot performance with 500 times larger models, such as GPT-3.

الحساب واللغة الذكاء الاصطناعي

Figurative Language in Recognizing Textual Entailment

240 - Tuhin Chakrabarty , Debanjan Ghosh , Adam Poliak 2021

We introduce a collection of recognizing textual entailment (RTE) datasets focused on figurative language. We leverage five existing datasets annotated for a variety of figurative language -- simile, metaphor, and irony -- and frame them into over 12 ,500 RTE examples.We evaluate how well state-of-the-art models trained on popular RTE datasets capture different aspects of figurative language. Our results and analyses indicate that these models might not sufficiently capture figurative language, struggling to perform pragmatic inference and reasoning about world knowledge. Ultimately, our datasets provide a challenging testbed for evaluating RTE models.

الحساب واللغة الذكاء الاصطناعي

e-SNLI-VE: Corrected Visual-Textual Entailment with Natural Language Explanations

96 - Virginie Do , Oana-Maria Camburu , Zeynep Akata 2020

The recently proposed SNLI-VE corpus for recognising visual-textual entailment is a large, real-world dataset for fine-grained multimodal reasoning. However, the automatic way in which SNLI-VE has been assembled (via combining parts of two related da tasets) gives rise to a large number of errors in the labels of this corpus. In this paper, we first present a data collection effort to correct the class with the highest error rate in SNLI-VE. Secondly, we re-evaluate an existing model on the corrected corpus, which we call SNLI-VE-2.0, and provide a quantitative comparison with its performance on the non-corrected corpus. Thirdly, we introduce e-SNLI-VE, which appends human-written natural language explanations to SNLI-VE-2.0. Finally, we train models that learn from these explanations at training time, and output such explanations at testing time.

الحساب واللغة الذكاء الاصطناعي الرؤية الحاسوبية وتمييز الأنماط

Continuous Entailment Patterns for Lexical Inference in Context

484 - Martin Schmitt , Hinrich Schutze 2021

Combining a pretrained language model (PLM) with textual patterns has been shown to help in both zero- and few-shot settings. For zero-shot performance, it makes sense to design patterns that closely resemble the text seen during self-supervised pret raining because the model has never seen anything else. Supervised training allows for more flexibility. If we allow for tokens outside the PLMs vocabulary, patterns can be adapted more flexibly to a PLMs idiosyncrasies. Contrasting patterns where a token can be any continuous vector vs. those where a discrete choice between vocabulary elements has to be made, we call our method CONtinuous pAtterNs (CONAN). We evaluate CONAN on two established benchmarks for lexical inference in context (LIiC) a.k.a. predicate entailment, a challenging natural language understanding task with relatively small training sets. In a direct comparison with discrete patterns, CONAN consistently leads to improved performance, setting a new state of the art. Our experiments give valuable insights into the kind of pattern that enhances a PLMs performance on LIiC and raise important questions regarding our understanding of PLMs using text patterns.

الحساب واللغة الذكاء الاصطناعي