FlowQA: Grasping Flow in History for Conversational Machine Comprehension

54 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Hsin-Yuan Huang

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Hsin-Yuan Huang - Eunsol Choi - Wen-tau Yih

الحساب واللغة الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Conversational machine comprehension requires the understanding of the conversation history, such as previous question/answer pairs, the document context, and the current question. To enable traditional, single-turn models to encode the history comprehensively, we introduce Flow, a mechanism that can incorporate intermediate representations generated during the process of answering previous questions, through an alternating parallel processing structure. Compared to approaches that concatenate previous questions/answers as input, Flow integrates the latent semantics of the conversation history more deeply. Our model, FlowQA, shows superior performance on two recently proposed conversational challenges (+7.2% F1 on CoQA and +4.0% on QuAC). The effectiveness of Flow also shows in other tasks. By reducing sequential instruction understanding to conversational machine comprehension, FlowQA outperforms the best models on all three domains in SCONE, with +1.8% to +4.4% improvement in accuracy.

قيم البحث

101 - Yiming Cui , Wei-Nan Zhang , Wanxiang Che 2021

Achieving human-level performance on some of Machine Reading Comprehension (MRC) datasets is no longer challenging with the help of powerful Pre-trained Language Models (PLMs). However, the internal mechanism of these artifacts still remains unclear, placing an obstacle for further understanding these models. This paper focuses on conducting a series of analytical experiments to examine the relations between the multi-head self-attention and the final performance, trying to analyze the potential explainability in PLM-based MRC models. We perform quantitative analyses on SQuAD (English) and CMRC 2018 (Chinese), two span-extraction MRC datasets, on top of BERT, ALBERT, and ELECTRA in various aspects. We discover that {em passage-to-question} and {em passage understanding} attentions are the most important ones, showing strong correlations to the final performance than other parts. Through visualizations and case studies, we also observe several general findings on the attention maps, which could be helpful to understand how these models solve the questions.

الحساب واللغة الذكاء الاصطناعي

Reference Knowledgeable Network for Machine Reading Comprehension

99 - Yilin Zhao , Zhuosheng Zhang , Hai Zhao 2020

Multi-choice Machine Reading Comprehension (MRC) as a challenge requires model to select the most appropriate answer from a set of candidates given passage and question. Most of the existing researches focus on the modeling of the task datasets witho ut explicitly referring to external fine-grained knowledge sources, which is supposed to greatly make up the deficiency of the given passage. Thus we propose a novel reference-based knowledge enhancement model called Reference Knowledgeable Network (RekNet), which refines critical information from the passage and quote explicit knowledge in necessity. In detail, RekNet refines fine-grained critical information and defines it as Reference Span, then quotes explicit knowledge quadruples by the co-occurrence information of Reference Span and candidates. The proposed RekNet is evaluated on three multi-choice MRC benchmarks: RACE, DREAM and Cosmos QA, which shows consistent and remarkable performance improvement with observable statistical significance level over strong baselines.

الحساب واللغة الذكاء الاصطناعي

Open-Retrieval Conversational Machine Reading

151 - Yifan Gao , Jingjing Li , Michael R. Lyu 2021

In conversational machine reading, systems need to interpret natural language rules, answer high-level questions such as May I qualify for VA health care benefits?, and ask follow-up clarification questions whose answer is necessary to answer the ori ginal question. However, existing works assume the rule text is provided for each user question, which neglects the essential retrieval step in real scenarios. In this work, we propose and investigate an open-retrieval setting of conversational machine reading. In the open-retrieval setting, the relevant rule texts are unknown so that a system needs to retrieve question-relevant evidence from a collection of rule texts, and answer users high-level questions according to multiple retrieved rule texts in a conversational manner. We propose MUDERN, a Multi-passage Discourse-aware Entailment Reasoning Network which extracts conditions in the rule texts through discourse segmentation, conducts multi-passage entailment reasoning to answer user questions directly, or asks clarification follow-up questions to inquiry more information. On our created OR-ShARC dataset, MUDERN achieves the state-of-the-art performance, outperforming existing single-passage conversational machine reading models as well as a new multi-passage conversational machine reading baseline by a large margin. In addition, we conduct in-depth analyses to provide new insights into this new setting and our model.

الحساب واللغة الذكاء الاصطناعي التعلم الآلي

Smoothing Dialogue States for Open Conversational Machine Reading

119 - Zhuosheng Zhang , Siru Ouyang , Hai Zhao 2021

Conversational machine reading (CMR) requires machines to communicate with humans through multi-turn interactions between two salient dialogue states of decision making and question generation processes. In open CMR settings, as the more realistic sc enario, the retrieved background knowledge would be noisy, which results in severe challenges in the information transmission. Existing studies commonly train independent or pipeline systems for the two subtasks. However, those methods are trivial by using hard-label decisions to activate question generation, which eventually hinders the model performance. In this work, we propose an effective gating strategy by smoothing the two dialogue states in only one decoder and bridge decision making and question generation to provide a richer dialogue state reference. Experiments on the OR-ShARC dataset show the effectiveness of our method, which achieves new state-of-the-art results.

الحساب واللغة الذكاء الاصطناعي تفاعل الإنسان والحاسوب

A Simple but Effective Method to Incorporate Multi-turn Context with BERT for Conversational Machine Comprehension

107 - Yasuhito Ohsugi , Itsumi Saito , Kyosuke Nishida 2019

Conversational machine comprehension (CMC) requires understanding the context of multi-turn dialogue. Using BERT, a pre-training language model, has been successful for single-turn machine comprehension, while modeling multiple turns of question answ ering with BERT has not been established because BERT has a limit on the number and the length of input sequences. In this paper, we propose a simple but effective method with BERT for CMC. Our method uses BERT to encode a paragraph independently conditioned with each question and each answer in a multi-turn context. Then, the method predicts an answer on the basis of the paragraph representations encoded with BERT. The experiments with representative CMC datasets, QuAC and CoQA, show that our method outperformed recently published methods (+0.8 F1 on QuAC and +2.1 F1 on CoQA). In addition, we conducted a detailed analysis of the effects of the number and types of dialogue history on the accuracy of CMC, and we found that the gold answer history, which may not be given in an actual conversation, contributed to the model performance most on both datasets.

الحساب واللغة