Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting

77 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Jheng-Hong Yang

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Sheng-Chieh Lin - Jheng-Hong Yang - Rodrigo Nogueira

الحساب واللغة الذكاء الاصطناعي استرجاع المعلومات

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Conversational search plays a vital role in conversational information seeking. As queries in information seeking dialogues are ambiguous for traditional ad-hoc information retrieval (IR) systems due to the coreference and omission resolution problems inherent in natural language dialogue, resolving these ambiguities is crucial. In this paper, we tackle conversational passage retrieval (ConvPR), an important component of conversational search, by addressing query ambiguities with query reformulation integrated into a multi-stage ad-hoc IR system. Specifically, we propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting. For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals. For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-tosequence model. Detailed analyses of the two CQR methods are provided quantitatively and qualitatively, explaining their advantages, disadvantages, and distinct behaviors. Moreover, to leverage the strengths of both CQR methods, we propose combining their output with reciprocal rank fusion, yielding state-of-the-art retrieval effectiveness, 30% improvement in terms of NDCG@3 compared to the best submission of TREC CAsT 2019.

قيم البحث

106 - Svitlana Vakulenko , Nikos Voskarides , Zhucheng Tu 2021

Conversational passage retrieval relies on question rewriting to modify the original question so that it no longer depends on the conversation history. Several methods for question rewriting have recently been proposed, but they were compared under d ifferent retrieval pipelines. We bridge this gap by thoroughly evaluating those question rewriting methods on the TREC CAsT 2019 and 2020 datasets under the same retrieval pipeline. We analyze the effect of different types of question rewriting methods on retrieval performance and show that by combining question rewriting methods of different types we can achieve state-of-the-art performance on both datasets.

استرجاع المعلومات الحساب واللغة

Leveraging Query Resolution and Reading Comprehension for Conversational Passage Retrieval

98 - Svitlana Vakulenko , Nikos Voskarides , Zhucheng Tu 2021

This paper describes the participation of UvA.ILPS group at the TREC CAsT 2020 track. Our passage retrieval pipeline consists of (i) an initial retrieval module that uses BM25, and (ii) a re-ranking module that combines the score of a BERT ranking mo del with the score of a machine comprehension model adjusted for passage retrieval. An important challenge in conversational passage retrieval is that queries are often under-specified. Thus, we perform query resolution, that is, add missing context from the conversation history to the current turn query using QuReTeC, a term classification query resolution model. We show that our best automatic and manual runs outperform the corresponding median runs by a large margin.

استرجاع المعلومات الحساب واللغة

Joint Passage Ranking for Diverse Multi-Answer Retrieval

110 - Sewon Min , Kenton Lee , Ming-Wei Chang 2021

We study multi-answer retrieval, an under-explored problem that requires retrieving passages to cover multiple distinct answers for a given question. This task requires joint modeling of retrieved passages, as models should not repeatedly retrieve pa ssages containing the same answer at the cost of missing a different valid answer. Prior work focusing on single-answer retrieval is limited as it cannot reason about the set of passages jointly. In this paper, we introduce JPR, a joint passage retrieval model focusing on reranking. To model the joint probability of the retrieved passages, JPR makes use of an autoregressive reranker that selects a sequence of passages, equipped with novel training and decoding algorithms. Compared to prior approaches, JPR achieves significantly better answer coverage on three multi-answer datasets. When combined with downstream question answering, the improved retrieval enables larger answer generation models since they need to consider fewer passages, establishing a new state-of-the-art.

الحساب واللغة الذكاء الاصطناعي

Personalized Query Rewriting in Conversational AI Agents

84 - Alireza Roshan-Ghias , Clint Solomon Mathialagan , Pragaash Ponnusamy 2020

Spoken language understanding (SLU) systems in conversational AI agents often experience errors in the form of misrecognitions by automatic speech recognition (ASR) or semantic gaps in natural language understanding (NLU). These errors easily transla te to user frustrations, particularly so in recurrent events e.g. regularly toggling an appliance, calling a frequent contact, etc. In this work, we propose a query rewriting approach by leveraging users historically successful interactions as a form of memory. We present a neural retrieval model and a pointer-generator network with hierarchical attention and show that they perform significantly better at the query rewriting task with the aforementioned user memories than without. We also highlight how our approach with the proposed models leverages the structural and semantic diversity in ASRs output towards recovering users intents.

الذكاء الاصطناعي الحساب واللغة التعلم الآلي

NeuTral Rewriter: A Rule-Based and Neural Approach to Automatic Rewriting into Gender-Neutral Alternatives

62 - Eva Vanmassenhove , Chris Emmery , Dimitar Shterionov 2021

Recent years have seen an increasing need for gender-neutral and inclusive language. Within the field of NLP, there are various mono- and bilingual use cases where gender inclusive language is appropriate, if not preferred due to ambiguity or uncerta inty in terms of the gender of referents. In this work, we present a rule-based and a neural approach to gender-neutral rewriting for English along with manually curated synthetic data (WinoBias+) and natural data (OpenSubtitles and Reddit) benchmarks. A detailed manual and automatic evaluation highlights how our NeuTral Rewriter, trained on data generated by the rule-based approach, obtains word error rates (WER) below 0.18% on synthetic, in-domain and out-domain test sets.

الحساب واللغة الذكاء الاصطناعي