New community

Subscribe to the gold package and get unlimited access to Shamra Academy

What does BERT Learn from Arabic Machine Reading Comprehension Datasets?

ماذا تعلم بيرت من مجموعات بيانات الفهم الآلي للآلة العربية؟

500 0 0 0.0 ( 0 )

Download Cite

Added by Association for Computation Linguistics مقالة

Publication date 2021

fields Artificial Intelligence

and research's language is English

Created by Shamra Editor

machine reading comprehension bert learn reading comprehension datasets آلة قراءة الآلة بيرت تعلم قراءة مجموعات البيانات الفهم صناعة حمض الفوسفور

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In machine reading comprehension tasks, a model must extract an answer from the available context given a question and a passage. Recently, transformer-based pre-trained language models have achieved state-of-the-art performance in several natural language processing tasks. However, it is unclear whether such performance reflects true language understanding. In this paper, we propose adversarial examples to probe an Arabic pre-trained language model (AraBERT), leading to a significant performance drop over four Arabic machine reading comprehension datasets. We present a layer-wise analysis for the transformer's hidden states to offer insights into how AraBERT reasons to derive an answer. The experiments indicate that AraBERT relies on superficial cues and keyword matching rather than text understanding. Furthermore, hidden state visualization demonstrates that prediction errors can be recognized from vector representations in earlier layers.

References used

https://aclanthology.org/

rate research

Does Structure Matter? Encoding Documents for Machine Reading Comprehension

346 - Association for Computation Linguistics 2021 مقالة

Machine reading comprehension is a challenging task especially for querying documents with deep and interconnected contexts. Transformer-based methods have shown advanced performances on this task; however, most of them still treat documents as a fla t sequence of tokens. This work proposes a new Transformer-based method that reads a document as tree slices. It contains two modules for identifying more relevant text passage and the best answer span respectively, which are not only jointly trained but also jointly consulted at inference time. Our evaluation results show that our proposed method outperforms several competitive baseline approaches on two datasets from varied domains.

structure matter هيكل مسألة صناعة حمض الفوسفور

Does BERT Learn as Humans Perceive? Understanding Linguistic Styles through Lexica

605 - Association for Computation Linguistics 2021 مقالة

People convey their intention and attitude through linguistic styles of the text that they write. In this study, we investigate lexicon usages across styles throughout two lenses: human perception and machine word importance, since words differ in th e strength of the stylistic cues that they provide. To collect labels of human perception, we curate a new dataset, Hummingbird, on top of benchmarking style datasets. We have crowd workers highlight the representative words in the text that makes them think the text has the following styles: politeness, sentiment, offensiveness, and five emotion types. We then compare these human word labels with word importance derived from a popular fine-tuned style classifier like BERT. Our results show that the BERT often finds content words not relevant to the target style as important words used in style prediction, but humans do not perceive the same way even though for some styles (e.g., positive sentiment and joy) human- and machine-identified words share significant overlap for some styles.

التفكير الشديد learn styles يتعلم أنماط صناعة حمض الفوسفور

Adversarial Training for Machine Reading Comprehension with Virtual Embeddings

541 - Association for Computation Linguistics 2021 مقالة

Adversarial training (AT) as a regularization method has proved its effectiveness on various tasks. Though there are successful applications of AT on some NLP tasks, the distinguishing characteristics of NLP tasks have not been exploited. In this pap er, we aim to apply AT on machine reading comprehension (MRC) tasks. Furthermore, we adapt AT for MRC tasks by proposing a novel adversarial training method called PQAT that perturbs the embedding matrix instead of word vectors. To differentiate the roles of passages and questions, PQAT uses additional virtual P/Q-embedding matrices to gather the global perturbations of words from passages and questions separately. We test the method on a wide range of MRC tasks, including span-based extractive RC and multiple-choice RC. The results show that adversarial training is effective universally, and PQAT further improves the performance.

فورانيا الموازية adversarial training machine reading التدريب الخصم آلة قراءة صناعة حمض الفوسفور

Relying on Discourse Analysis to Answer Complex Questions by Neural Machine Reading Comprehension

315 - Association for Computation Linguistics 2021 مقالة

Machine reading comprehension (MRC) is one of the most challenging tasks in natural language processing domain. Recent state-of-the-art results for MRC have been achieved with the pre-trained language models, such as BERT and its modifications. Despi te the high performance of these models, they still suffer from the inability to retrieve correct answers from the detailed and lengthy passages. In this work, we introduce a novel scheme for incorporating the discourse structure of the text into a self-attention network, and, thus, enrich the embedding obtained from the standard BERT encoder with the additional linguistic knowledge. We also investigate the influence of different types of linguistic information on the model's ability to answer complex questions that require deep understanding of the whole text. Experiments performed on the SQuAD benchmark and more complex question answering datasets have shown that linguistic enhancing boosts the performance of the standard BERT model significantly.

تحديد اللغة الهجومية neural machine reading آلة القراءة العصبية صناعة حمض الفوسفور

What can Neural Referential Form Selectors Learn?

242 - Association for Computation Linguistics 2021 مقالة

Despite achieving encouraging results, neural Referring Expression Generation models are often thought to lack transparency. We probed neural Referential Form Selection (RFS) models to find out to what extent the linguistic features influencing the R E form are learned and captured by state-of-the-art RFS models. The results of 8 probing tasks show that all the defined features were learned to some extent. The probing tasks pertaining to referential status and syntactic position exhibited the highest performance. The lowest performance was achieved by the probing models designed to predict discourse structure properties beyond the sentence level.

form selectors learn referential form selectors selectors learn محددات المشكلات تعلم محدد النماذج المرجعية محددات تعلم صناعة حمض الفوسفور المزيد..

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

What does BERT Learn from Arabic Machine Reading Comprehension Datasets?

ماذا تعلم بيرت من مجموعات بيانات الفهم الآلي للآلة العربية؟

Ask ChatGPT about the research

Read More

suggested questions