We propose a simple method to generate multilingual question and answer pairs on a large scale through the use of a single generative model. These synthetic samples can be used to improve the zero-shot performance of multilingual QA models on target languages. Our proposed multi-task training of the generative model only requires labeled training samples in English, thus removing the need for such samples in the target languages, making it applicable to far more languages than those with labeled data. Human evaluations indicate the majority of such samples are grammatically correct and sensible. Experimental results show our proposed approach can achieve large gains on the XQuAD dataset, reducing the gap between zero-shot and supervised performance of smaller QA models on various languages.
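As a rough illustration of how such synthetic samples might be combined with the English labeled data before training a QA model, here is a minimal Python sketch. It is not the authors' pipeline: the `QASample` record, the `is_well_formed` check (which assumes extractive QA, where the answer is a span of the context, as in XQuAD), and the `build_training_set` helper are all hypothetical names introduced for this example, and the abstract does not say how the samples are filtered or mixed.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class QASample:
    """Hypothetical record for one question-answer pair (not from the paper)."""
    context: str
    question: str
    answer: str
    lang: str        # language code, e.g. "en", "de"
    synthetic: bool  # True if produced by the generative model


def is_well_formed(sample: QASample) -> bool:
    # Assumption: extractive QA, so the answer must appear verbatim in the context.
    return (
        bool(sample.question.strip())
        and bool(sample.answer.strip())
        and sample.answer in sample.context
    )


def build_training_set(english_labeled, synthetic, target_langs, seed=0):
    """Mix English labeled samples with well-formed synthetic target-language samples."""
    kept = [
        s for s in synthetic
        if s.lang in target_langs and is_well_formed(s)
    ]
    mixed = list(english_labeled) + kept
    # Shuffle with a fixed seed so the mix is reproducible.
    random.Random(seed).shuffle(mixed)
    return mixed
```

In this sketch, only the synthetic samples are filtered; the English labeled samples are passed through unchanged, reflecting the abstract's point that labeled data is needed only in English.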
References used
https://aclanthology.org/