Robust Reading Comprehension with Linguistic Constraints via Posterior Regularization

82 0 0.0 ( 0 )

Download Cite

Added by Mantong Zhou

Publication date 2019

fields Informatics Engineering

and research's language is English

Authors Mantong Zhou - Minlie Huang - Xiaoyan Zhu

Computation and Language

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In spite of great advancements of machine reading comprehension (RC), existing RC models are still vulnerable and not robust to different types of adversarial examples. Neural models over-confidently predict wrong answers to semantic different adversarial examples, while over-sensitively predict wrong answers to semantic equivalent adversarial examples. Existing methods which improve the robustness of such neural models merely mitigate one of the two issues but ignore the other. In this paper, we address the over-confidence issue and the over-sensitivity issue existing in current RC models simultaneously with the help of external linguistic knowledge. We first incorporate external knowledge to impose different linguistic constraints (entity constraint, lexical constraint, and predicate constraint), and then regularize RC models through posterior regularization. Linguistic constraints induce more reasonable predictions for both semantic different and semantic equivalent adversarial examples, and posterior regularization provides an effective mechanism to incorporate these constraints. Our method can be applied to any existing neural RC models including state-of-the-art BERT models. Extensive experiments show that our method remarkably improves the robustness of base RC models, and is better to cope with these two issues simultaneously.

rate research

Reading Comprehension Ability Test-A Turing Test for Reading Comprehension

242 - Yuan Miao , Gongqi Lin , Yidan Hu 2019

Reading comprehension is an important ability of human intelligence. Literacy and numeracy are two most essential foundation for people to succeed at study, at work and in life. Reading comprehension ability is a core component of literacy. In most of the education systems, developing reading comprehension ability is compulsory in the curriculum from year one to year 12. It is an indispensable ability in the dissemination of knowledge. With the emerging artificial intelligence, computers start to be able to read and understand like people in some context. They can even read better than human beings for some tasks, but have little clue in other tasks. It will be very beneficial if we can identify the levels of machine comprehension ability, which will direct us on the further improvement. Turing test is a well-known test of the difference between computer intelligence and human intelligence. In order to be able to compare the difference between people reading and machines reading, we proposed a test called (reading) Comprehension Ability Test (CAT).CAT is similar to Turing test, passing of which means we cannot differentiate people from algorithms in term of their comprehension ability. CAT has multiple levels showing the different abilities in reading comprehension, from identifying basic facts, performing inference, to understanding the intent and sentiment.

Computation and Language

Improving Low-resource Reading Comprehension via Cross-lingual Transposition Rethinking

177 - Gaochen Wu , Bin Xu1 , Yuxin Qin 2021

Extractive Reading Comprehension (ERC) has made tremendous advances enabled by the availability of large-scale high-quality ERC training data. Despite of such rapid progress and widespread application, the datasets in languages other than high-resource languages such as English remain scarce. To address this issue, we propose a Cross-Lingual Transposition ReThinking (XLTT) model by modelling existing high-quality extractive reading comprehension datasets in a multilingual environment. To be specific, we present multilingual adaptive attention (MAA) to combine intra-attention and inter-attention to learn more general generalizable semantic and lexical knowledge from each pair of language families. Furthermore, to make full use of existing datasets, we adopt a new training framework to train our model by calculating task-level similarities between each existing dataset and target dataset. The experimental results show that our XLTT model surpasses six baselines on two multilingual ERC benchmarks, especially more effective for low-resource languages with 3.9 and 4.1 average improvement in F1 and EM, respectively.

Computation and Language Artificial Intelligence

Multi-style Generative Reading Comprehension

113 - Kyosuke Nishida , Itsumi Saito , Kosuke Nishida 2019

This study tackles generative reading comprehension (RC), which consists of answering questions based on textual evidence and natural language generation (NLG). We propose a multi-style abstractive summarization model for question answering, called Masque. The proposed model has two key characteristics. First, unlike most studies on RC that have focused on extracting an answer span from the provided passages, our model instead focuses on generating a summary from the question and multiple passages. This serves to cover various answer styles required for real-world applications. Second, whereas previous studies built a specific model for each answer style because of the difficulty of acquiring one general model, our approach learns multi-style answers within a model to improve the NLG capability for all styles involved. This also enables our model to give an answer in the target style. Experiments show that our model achieves state-of-the-art performance on the Q&A task and the Q&A + NLG task of MS MARCO 2.1 and the summary task of NarrativeQA. We observe that the transfer of the style-independent NLG capability to the target style is the key to its success.

Computation and Language

Undersensitivity in Neural Reading Comprehension

139 - Johannes Welbl , Pasquale Minervini , Max Bartolo 2020

Current reading comprehension models generalise well to in-distribution test sets, yet perform poorly on adversarially selected inputs. Most prior work on adversarial inputs studies oversensitivity: semantically invariant text perturbations that cause a models prediction to change when it should not. In this work we focus on the complementary problem: excessive prediction undersensitivity, where input text is meaningfully changed but the models prediction does not, even though it should. We formulate a noisy adversarial attack which searches among semantic variations of the question for which a model erroneously predicts the same answer, and with even higher probability. Despite comprising unanswerable questions, both SQuAD2.0 and NewsQA models are vulnerable to this attack. This indicates that although accurate, models tend to rely on spurious patterns and do not fully consider the information specified in a question. We experiment with data augmentation and adversarial training as defences, and find that both substantially decrease vulnerability to attacks on held out data, as well as held out attack spaces. Addressing undersensitivity also improves results on AddSent and AddOneSent, and models furthermore generalise better when facing train/evaluation distribution mismatch: they are less prone to overly rely on predictive cues present only in the training set, and outperform a conventional model by as much as 10.9% F1.

Computation and Language Machine Learning Machine Learning

The NarrativeQA Reading Comprehension Challenge

109 - Tomav{s} Kov{c}isky , Jonathan Schwarz , Phil Blunsom 2017

Reading comprehension (RC)---in contrast to information retrieval---requires integrating information and reasoning about events, entities, and their relations across a full document. Question answering is conventionally used to assess RC ability, in both artificial agents and children learning to read. However, existing RC datasets and tasks are dominated by questions that can be solved by selecting answers using superficial information (e.g., local context similarity or global term frequency); they thus fail to test for the essential integrative aspect of RC. To encourage progress on deeper comprehension of language, we present a new dataset and set of tasks in which the reader must answer questions about stories by reading entire books or movie scripts. These tasks are designed so that successfully answering their questions requires understanding the underlying narrative rather than relying on shallow pattern matching or salience. We show that although humans solve the tasks easily, standard RC models struggle on the tasks presented here. We provide an analysis of the dataset and the challenges it presents.

Computation and Language Artificial Intelligence Neural and Evolutionary Computing