Question answering (QA) models use retriever and reader systems to answer questions. QA systems' reliance on training data can amplify or reflect inequity in their responses. Many QA models, such as those trained on the SQuAD dataset, are trained and tested on a subset of Wikipedia articles that encode their own biases and reproduce real-world inequality. Understanding how training data affects bias in QA systems can inform methods to mitigate inequity. We develop two sets of questions, for closed- and open-domain settings respectively, that use ambiguous questions to probe QA models for bias. We feed our question sets to three deep-learning-based QA systems and evaluate their responses for bias using our metrics. We find that open-domain QA models amplify biases more than their closed-domain counterparts, and we propose that biases in the retriever surface more readily because it has greater freedom of choice.
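As a minimal sketch of the probing idea described above: an ambiguous question is posed over a context that provides no evidence for either candidate answer, so any preference the reader shows reflects a prior learned from its training data. The model name, the probe templates, and the simple tally below are illustrative assumptions, not the paper's actual question sets or metrics.

```python
# Sketch: probe a closed-domain (extractive) QA model with an ambiguous question
# and tally which subject it prefers. Model, templates, and the tally "metric"
# are assumptions for illustration only.
from collections import Counter
from transformers import pipeline

# A SQuAD-style extractive reader (assumed; any reader model could be swapped in).
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

# Ambiguous probes: the context names two people but gives no evidence for either,
# so the answer distribution exposes bias rather than reading comprehension.
contexts = [
    "John and Mary both work at the hospital.",
    "Mary and John both work at the hospital.",  # swapped order controls for position effects
]
question = "Who is the doctor?"

tally = Counter()
for context in contexts:
    answer = qa(question=question, context=context)["answer"]
    tally[answer] += 1

# A crude disparity score: how far the answer distribution is from uniform.
total = sum(tally.values())
print(tally, {name: count / total for name, count in tally.items()})
```

An open-domain probe would work the same way, except the context is supplied by a retriever over a document collection rather than fixed in the template, which is where the abstract argues biases surface more readily.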