This work presents R2-D2 (Rank twice, reaD twice), a novel four-stage open-domain QA pipeline. The pipeline is composed of a retriever, a passage reranker, an extractive reader, a generative reader, and a mechanism that aggregates the final prediction from all of the system's components. We demonstrate its strength across three open-domain QA datasets: NaturalQuestions, TriviaQA and EfficientQA, surpassing the state of the art on the first two. Our analysis demonstrates that: (i) combining the extractive and generative readers yields absolute improvements of up to 5 exact-match points and is at least twice as effective as the posterior averaging ensemble of the same models with different parameters, and (ii) the extractive reader, with fewer parameters, can match the performance of the generative reader on extractive QA datasets.
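As a rough illustration of how the four stages and the aggregation step fit together, the following minimal Python sketch wires them up end to end. Everything in it is an assumption made for illustration: the function names, the toy token-overlap scorer, the equal weights and the linear score combination are hypothetical, and the abstract does not specify how R2-D2 scores or fuses candidates. The two readers are passed in as plain callables that return log-score dictionaries.

```python
from typing import Callable, Dict, List

# All names here are hypothetical stand-ins, not the paper's implementation.
Reader = Callable[[str, List[str]], Dict[str, float]]  # answer -> log-score


def overlap(a: str, b: str) -> float:
    """Toy relevance score: number of shared lowercase tokens."""
    return float(len(set(a.lower().split()) & set(b.lower().split())))


def retrieve(question: str, corpus: List[str], k: int) -> List[str]:
    """Stage 1 (first ranking): keep the k passages with the highest toy score."""
    return sorted(corpus, key=lambda p: overlap(question, p), reverse=True)[:k]


def rerank(question: str, passages: List[str], k: int) -> List[str]:
    """Stage 2 (second ranking): rescore the shortlist with a different scorer.
    Placeholder scorer: overlap normalized by passage length."""
    def score(p: str) -> float:
        return overlap(question, p) / max(len(p.split()), 1)
    return sorted(passages, key=score, reverse=True)[:k]


def aggregate(extractive: Dict[str, float], generative: Dict[str, float],
              w_ext: float = 0.5, w_gen: float = 0.5) -> str:
    """Fuse candidate answers from both readers with a weighted sum of scores.
    An answer missing from one reader gets a floor score below every seen score."""
    floor = min(list(extractive.values()) + list(generative.values())) - 1.0

    def combined(ans: str) -> float:
        return (w_ext * extractive.get(ans, floor)
                + w_gen * generative.get(ans, floor))

    # Sort candidates first so ties are broken deterministically.
    return max(sorted(set(extractive) | set(generative)), key=combined)


def answer(question: str, corpus: List[str], ext_reader: Reader,
           gen_reader: Reader, k_retrieve: int = 2, k_rerank: int = 1) -> str:
    """Run retriever -> reranker -> both readers (stages 3 and 4) -> aggregation."""
    shortlist = rerank(question, retrieve(question, corpus, k_retrieve), k_rerank)
    return aggregate(ext_reader(question, shortlist),
                     gen_reader(question, shortlist))
```

In this sketch a candidate proposed by both readers tends to win over one proposed by only a single reader, which is one simple way to read "aggregates the final prediction from all of the system's components"; a real system would presumably learn the weights rather than fix them.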
References used
https://aclanthology.org/
Numerical reasoning skills are essential for complex question answering (CQA) over text. It requires operations including counting, comparison, addition and subtraction. A successful approach to CQA on text, Neural Module Networks (NMNs), follows the
Dense neural text retrieval has achieved promising results on open-domain Question Answering (QA), where latent representations of questions and passages are exploited for maximum inner product search in the retrieval process. However, current dense
Open-domain question answering aims at locating the answers to user-generated questions in massive collections of documents. Retriever-readers and knowledge graph approaches are two big families of solutions to this task. A retriever-reader first app
We introduce a new dataset for Question Rewriting in Conversational Context (QReCC), which contains 14K conversations with 80K question-answer pairs. The task in QReCC is to find answers to conversational questions within a collection of 10M web page
In open-domain question answering, dense passage retrieval has become a new paradigm to retrieve relevant passages for finding answers. Typically, the dual-encoder architecture is adopted to learn dense representations of questions and passages for s