Machine reading comprehension (MRC) is a challenging NLP task because it requires careful handling of every linguistic granularity, from word and sentence to passage. For extractive MRC, the answer span has been shown to be determined mostly by a key evidence linguistic unit, which is a sentence in most cases. However, we recently discovered that in many languages sentences are, to varying degrees, not clearly defined. This causes the so-called location unit ambiguity problem: it is difficult for a model to determine exactly which sentence contains the answer span when the sentence itself is not clearly defined. Taking Chinese as a case study, we explain and analyze this linguistic phenomenon and propose a reader with Explicit Span-Sentence Predication to alleviate the problem. Our proposed reader achieves a new state-of-the-art on a Chinese MRC benchmark and shows great potential for handling other languages.
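The location unit ambiguity described in the abstract can be illustrated with a small sketch. This is not the proposed reader; it only shows how the "sentence" that contains an answer span changes with the segmentation convention. The example passage, the two delimiter sets, and the helper names `split_units` and `unit_containing` are all assumptions made for illustration.

```python
def split_units(text, delimiters):
    """Split text into units, closing a unit after each delimiter character."""
    units, current = [], ""
    for ch in text:
        current += ch
        if ch in delimiters:
            units.append(current)
            current = ""
    if current:  # keep any trailing text that has no closing delimiter
        units.append(current)
    return units


def unit_containing(units, answer_start):
    """Return (index, unit) of the unit covering the character offset answer_start."""
    offset = 0
    for i, unit in enumerate(units):
        if offset <= answer_start < offset + len(unit):
            return i, unit
        offset += len(unit)
    raise ValueError("answer_start is outside the text")


# Hypothetical passage and answer, chosen only for illustration.
passage = "小明早上去了图书馆，下午去了公园，晚上回家。"
answer = "公园"
answer_start = passage.index(answer)

# Two plausible conventions (assumed): full stops only, or commas as well.
coarse = split_units(passage, "。！？")
fine = split_units(passage, "。！？，；")

print(len(coarse), unit_containing(coarse, answer_start))
print(len(fine), unit_containing(fine, answer_start))
```

Under the first convention the whole passage is a single evidence unit, while under the second the answer falls in a much shorter clause, so a model trained to locate "the sentence containing the answer" receives a different target depending on the convention.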
References used
https://aclanthology.org/