
PALRACE: Reading Comprehension Dataset with Human Data and Labeled Rationales

Published by: Jiajie Zou
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





Pre-trained language models achieve high performance on machine reading comprehension (MRC) tasks, but their results are hard to explain. An appealing approach to making models explainable is to provide rationales for their decisions. To facilitate supervised learning of human rationales, here we present PALRACE (Pruned And Labeled RACE), a new MRC dataset with human-labeled rationales for 800 passages selected from the RACE dataset. We further classified the question for each passage into one of six types. Each passage was read by at least 26 participants, who labeled the rationales they used to answer the question. In addition, we conducted a rationale evaluation session in which participants were asked to answer the questions based solely on the labeled rationales, confirming that the labeled rationales are of high quality and sufficient to support question answering.
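
To make the dataset's structure concrete, the sketch below shows one hypothetical way a PALRACE-style record could be represented, and how per-participant rationale labels might be aggregated into a single mask. The field names and the majority-vote aggregation are illustrative assumptions, not the released schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PalraceExample:
    """Hypothetical layout of one PALRACE-style MRC example.

    All field names are illustrative assumptions, not the released schema.
    """
    passage: List[str]                 # the passage, split into sentences
    question: str
    options: List[str]                 # multiple-choice answer options
    answer: int                        # index of the correct option
    question_type: str                 # one of the six annotated types
    # one binary vector per participant; 1 = sentence marked as rationale
    rationale_votes: List[List[int]] = field(default_factory=list)

    def rationale_mask(self, threshold: float = 0.5) -> List[int]:
        """Aggregate per-participant votes into one rationale mask."""
        n = len(self.rationale_votes)
        if n == 0:
            return [0] * len(self.passage)
        return [
            1 if sum(votes[i] for votes in self.rationale_votes) / n >= threshold
            else 0
            for i in range(len(self.passage))
        ]
```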


Read also

Recent powerful pre-trained language models have achieved remarkable performance on most of the popular reading comprehension datasets. It is time to introduce more challenging datasets to push the field towards more comprehensive reasoning over text. In this paper, we introduce ReClor, a new reading comprehension dataset requiring logical reasoning, extracted from standardized graduate admission examinations. As earlier studies suggest, human-annotated datasets usually contain biases, which models often exploit to achieve high accuracy without truly understanding the text. To comprehensively evaluate the logical reasoning ability of models on ReClor, we propose to identify biased data points and separate them into an EASY set, with the rest forming a HARD set. Empirical results show that state-of-the-art models have an outstanding ability to capture the biases in the dataset, achieving high accuracy on the EASY set. However, they struggle on the HARD set, performing close to random guessing, which indicates that more research is needed to truly enhance the logical reasoning ability of current models.
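
One common way to identify biased data points of the kind described above is to train a model that sees only the answer options (no passage or question) and treat the examples it still answers correctly as EASY. The sketch below illustrates that split, assuming such option-only predictions are already available; it is a simplified stand-in, not the authors' procedure verbatim:

```python
from typing import List, Sequence, Tuple

def split_easy_hard(
    option_only_preds: Sequence[int],
    gold_labels: Sequence[int],
) -> Tuple[List[int], List[int]]:
    """Partition example indices into EASY (answered correctly by a model
    that never saw the passage or question, hence likely biased) and HARD.
    """
    easy, hard = [], []
    for i, (pred, gold) in enumerate(zip(option_only_preds, gold_labels)):
        (easy if pred == gold else hard).append(i)
    return easy, hard

# Toy usage with made-up predictions and labels.
easy, hard = split_easy_hard([0, 2, 1, 3], [0, 1, 1, 3])
print(easy, hard)  # [0, 2, 3] [1]
```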
Two main approaches for evaluating the quality of machine-generated rationales are: 1) using human rationales as a gold standard, and 2) using automated metrics based on how rationales affect model behavior. An open question, however, is how human rationales fare with these automatic metrics. Analyzing a variety of datasets and models, we find that human rationales do not necessarily perform well on these metrics. To unpack this finding, we propose improved metrics that account for model-dependent baseline performance. We then propose two methods to further characterize rationale quality, one based on model retraining and one on using fidelity curves to reveal properties such as irrelevance and redundancy. Our work leads to actionable suggestions for evaluating and characterizing rationales.
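
Metrics based on "how rationales affect model behavior" are often instantiated as sufficiency (the rationale alone should preserve the prediction) and comprehensiveness (removing the rationale should hurt it). The sketch below assumes a hypothetical `predict_proba` callable mapping a token list to class probabilities; it illustrates the general idea rather than this paper's exact metrics:

```python
from typing import Callable, List, Sequence

Predictor = Callable[[List[str]], Sequence[float]]

def sufficiency(predict_proba: Predictor, tokens: List[str],
                rationale: List[int], label: int) -> float:
    """Probability drop when keeping only rationale tokens (lower = better:
    the rationale alone should preserve the prediction)."""
    kept = [t for t, r in zip(tokens, rationale) if r == 1]
    return predict_proba(tokens)[label] - predict_proba(kept)[label]

def comprehensiveness(predict_proba: Predictor, tokens: List[str],
                      rationale: List[int], label: int) -> float:
    """Probability drop when removing rationale tokens (higher = better:
    removing the rationale should hurt the prediction)."""
    removed = [t for t, r in zip(tokens, rationale) if r == 0]
    return predict_proba(tokens)[label] - predict_proba(removed)[label]
```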
We present a Chinese judicial reading comprehension (CJRC) dataset, which contains approximately 10K documents and almost 50K questions with answers. The documents come from judgment documents, and the questions are annotated by law experts. The CJRC dataset can help researchers extract elements using reading comprehension technology. Element extraction is an important task in the legal field. However, it is difficult to predefine the element types completely, owing to the diversity of document types and causes of action. By contrast, machine reading comprehension technology can quickly extract elements by answering various questions about a long document. We build two strong baseline models based on BERT and BiDAF. The experimental results show that there is still ample room for improvement compared to human annotators.
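
A BERT-based extractive baseline of the kind mentioned above can be assembled with the Hugging Face transformers question-answering pipeline. The checkpoint below is only a placeholder (its QA head is not fine-tuned, so real use would require a model trained on CJRC-style data):

```python
from transformers import pipeline

# Placeholder checkpoint: bert-base-chinese has no trained QA head, so the
# output here is meaningless; a checkpoint fine-tuned on CJRC-style data
# would be substituted in practice.
qa = pipeline("question-answering", model="bert-base-chinese")

result = qa(
    question="被告人被判处何种刑罚?",  # "What sentence was imposed on the defendant?"
    context="法院经审理认为被告人构成盗窃罪,判处有期徒刑一年。",
)
print(result["answer"], result["score"])
```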
Ziyi Ye, Xiaohui Xie, Yiqun Liu (2021)
Reading comprehension is a complex cognitive process involving many human brain activities. Many works have studied reading patterns and attention allocation mechanisms during reading. However, little is known about what happens in the human brain during reading comprehension, or how this information can be used as implicit feedback to improve information acquisition. With advances in brain imaging techniques such as EEG, it is possible to collect high-precision brain signals in almost real time. Using such neuroimaging techniques, we carefully design a lab-based user study to investigate brain activities during reading comprehension. Our findings show that neural responses vary with different types of content, i.e., content that can satisfy users' information needs and content that cannot. We suggest that various cognitive activities at the micro-time scale during reading comprehension, e.g., cognitive load, semantic-thematic understanding, and inferential processing, underpin these neural responses. Inspired by these detectable differences in cognitive activities, we construct supervised learning models based on EEG features for two reading comprehension tasks: answer sentence classification and answer extraction. Results show that it is feasible to improve their performance with brain signals. These findings imply that brain signals are valuable feedback for enhancing human-computer interaction during reading comprehension.
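
The supervised models mentioned above pair EEG-derived features with standard classifiers. A minimal scikit-learn sketch of the answer-sentence-classification setup, with random placeholder features standing in for the paper's EEG feature extraction:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Stand-in data: one feature vector per sentence-reading epoch, e.g.
# band-power features per EEG channel (random placeholders here).
X = rng.normal(size=(200, 64))        # 200 epochs x 64 features
y = rng.integers(0, 2, size=200)      # 1 = sentence contains the answer

clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5)
print("CV accuracy:", scores.mean())
```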
Jiajie Zou, Nai Ding (2021)
Attention is a key mechanism for information selection in both biological brains and many state-of-the-art deep neural networks (DNNs). Here, we investigate whether humans and DNNs allocate attention in comparable ways when reading a text passage in order to subsequently answer a specific question. We analyze 3 transformer-based DNNs that reach human-level performance when trained to perform the reading comprehension task. We find that the DNN attention distribution quantitatively resembles the human attention distribution measured by fixation times. Human readers fixate longer on words that are more relevant to the question-answering task, demonstrating that attention is modulated by top-down reading goals on top of lower-level visual and text features of the stimulus. Further analyses reveal that the attention weights in DNNs are also influenced by both top-down reading goals and lower-level stimulus features, with shallow layers more strongly influenced by lower-level text features and deep layers attending more to task-relevant words. Additionally, the deep layers' attention to task-relevant words gradually emerges when pre-trained DNN models are fine-tuned to perform the reading comprehension task, coinciding with the improvement in task performance. These results demonstrate that DNNs can evolve a human-like attention distribution through task optimization, suggesting that human attention during goal-directed reading comprehension is itself a consequence of task optimization.
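
The resemblance between DNN attention and human fixation can be quantified by correlating two per-word weight vectors. A minimal sketch, assuming word-aligned attention weights and fixation durations are already extracted (the pooling and alignment steps are paper-specific and omitted):

```python
import numpy as np
from scipy.stats import spearmanr

# Stand-in per-word values for one passage; in practice these would be
# attention weights pooled over heads/layers and human fixation times.
attention = np.array([0.02, 0.10, 0.05, 0.30, 0.08, 0.45])
fixation_ms = np.array([120, 180, 140, 260, 150, 310])

rho, p = spearmanr(attention, fixation_ms)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```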
