Research papers, master's and doctoral theses about improving non-English question answering

GermanQuAD and GermanDPR: Improving Non-English Question Answering and Passage Retrieval

253 - Association for Computational Linguistics 2021 article

A major challenge of research on non-English machine reading for question answering (QA) is the lack of annotated datasets. In this paper, we present GermanQuAD, a dataset of 13,722 extractive question/answer pairs. To improve the reproducibility of the dataset creation approach and foster QA research on other languages, we summarize lessons learned and evaluate reformulation of question/answer pairs as a way to speed up the annotation process. An extractive QA model trained on GermanQuAD significantly outperforms multilingual models and also shows that machine-translated training data cannot fully substitute hand-annotated training data in the target language. Finally, we demonstrate the wide range of applications of GermanQuAD by adapting it to GermanDPR, a training dataset for dense passage retrieval (DPR), and train and evaluate one of the first non-English DPR models.

