SS-BERT: Mitigating Identity Terms Bias in Toxic Comment Classification by Utilising the Notion of Subjectivity and Identity Terms

Added by Zhixue Zhao

Publication date: 2021

Field: Informatics Engineering

Research language: English

Authors: Zhixue Zhao, Ziqi Zhang, Frank Hopfgartner

Computation and Language, Machine Learning

Toxic comment classification models are often found to be biased toward identity terms, i.e. terms that characterize a specific group of people, such as Muslim and black. Such bias is commonly reflected in false-positive predictions, i.e. non-toxic comments containing identity terms that are classified as toxic. In this work, we propose a novel approach to tackle such bias in toxic comment classification, leveraging the subjectivity level of a comment and the presence of identity terms. We hypothesize that when a comment is made about a group of people characterized by an identity term, the likelihood of that comment being toxic is associated with the subjectivity level of the comment, i.e. the extent to which the comment conveys personal feelings and opinions. Building upon the BERT model, we propose a new structure that is able to leverage these features, and we thoroughly evaluate our model on 4 datasets of varying sizes that represent different social media platforms. The results show that our model consistently outperforms BERT and a state-of-the-art (SOTA) model devised to address identity term bias in a different way, with maximum improvements in F1 of 2.43% and 1.91%, respectively.
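The false-positive pattern described in the abstract can be made concrete with a small sketch. The code below is not the authors' method or evaluation code. It only illustrates one way to compare the false-positive rate on non-toxic comments that contain identity terms against those that do not. The term list `IDENTITY_TERMS`, the whitespace tokenizer, and all function names are assumptions introduced for illustration.

```python
from typing import Sequence, Set, Tuple

# Hypothetical identity-term list, for illustration only (not from the paper).
IDENTITY_TERMS: Set[str] = {"muslim", "black", "gay", "jewish"}


def has_identity_term(comment: str, terms: Set[str] = IDENTITY_TERMS) -> bool:
    """Return True if any whitespace token, lowercased and stripped of
    surrounding punctuation, is in the identity-term set."""
    tokens = {tok.strip(".,!?;:\"'()").lower() for tok in comment.split()}
    return not tokens.isdisjoint(terms)


def false_positive_rate(labels: Sequence[int], preds: Sequence[int]) -> float:
    """FPR = non-toxic comments predicted toxic / all non-toxic comments.
    Labels and predictions use 1 for toxic and 0 for non-toxic."""
    preds_on_negatives = [p for y, p in zip(labels, preds) if y == 0]
    if not preds_on_negatives:
        return 0.0  # no non-toxic comments in this subset
    return sum(preds_on_negatives) / len(preds_on_negatives)


def fpr_by_identity(
    comments: Sequence[str], labels: Sequence[int], preds: Sequence[int]
) -> Tuple[float, float]:
    """Return (FPR on comments with identity terms, FPR on comments without)."""
    flags = [has_identity_term(c) for c in comments]
    with_terms = [(y, p) for f, y, p in zip(flags, labels, preds) if f]
    without_terms = [(y, p) for f, y, p in zip(flags, labels, preds) if not f]
    fpr_with = false_positive_rate(
        [y for y, _ in with_terms], [p for _, p in with_terms]
    )
    fpr_without = false_positive_rate(
        [y for y, _ in without_terms], [p for _, p in without_terms]
    )
    return fpr_with, fpr_without
```

A large gap between the two returned rates would be one symptom of the identity term bias the abstract describes. A real evaluation would need a curated term list and a proper tokenizer.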

The Authors Matter: Understanding and Mitigating Implicit Bias in Deep Text Classification

Haochen Liu, Wei Jin, Hamid Karimi (2021)

It is evident that deep text classification models trained on human data can be biased. In particular, they produce biased outcomes for texts that explicitly include identity terms of certain demographic groups. We refer to this type of bias as explicit bias, which has been extensively studied. However, deep text classification models can also produce biased outcomes for texts written by authors of certain demographic groups. We refer to such bias as implicit bias, of which we still have a rather limited understanding. In this paper, we first demonstrate that implicit bias exists in different text classification tasks for different demographic groups. Then, we build a learning-based interpretation method to deepen our knowledge of implicit bias. Specifically, we verify that classifiers learn to make predictions based on language features that are related to the demographic attributes of the authors. Next, we propose a framework, Debiased-TC, to train deep text classifiers to make predictions on the right features and consequently mitigate implicit bias. We conduct extensive experiments on three real-world datasets. The results show that text classification models trained under our proposed framework outperform traditional models significantly in terms of fairness, and also slightly in terms of classification performance.

Computation and Language

Mitigating Biases in Toxic Language Detection through Invariant Rationalization

Yung-Sung Chuang, Mingye Gao, Hongyin Luo (2021)

Automatic detection of toxic language plays an essential role in protecting social media users, especially minority groups, from verbal abuse. However, biases toward some attributes, including gender, race, and dialect, exist in most training datasets for toxicity detection. These biases make the learned models unfair and can even exacerbate the marginalization of people. Considering that current debiasing methods for general natural language understanding tasks cannot effectively mitigate the biases in toxicity detectors, we propose to use invariant rationalization (InvRat), a game-theoretic framework consisting of a rationale generator and a predictor, to rule out the spurious correlation of certain syntactic patterns (e.g., identity mentions, dialect) with toxicity labels. We empirically show that our method yields a lower false positive rate for both lexical and dialectal attributes than previous debiasing methods.

Computation and Language

Scaling of the 1-halo terms with bias

L. Raul Abramo, Irène Balmès, Fabien Lacasa (2015)

In the Halo Model, galaxies are hosted by dark matter halos, while the halos themselves are biased tracers of the underlying matter distribution. Measurements of galaxy correlation functions include contributions both from galaxies in different halos, and from galaxies in the same halo (the so-called 1-halo terms). We show that, for highly biased tracers, the 1-halo term of the power spectrum obeys a steep scaling relation in terms of bias. We also show that the 1-halo term of the trispectrum has a steep scaling with bias. The steepness of these scaling relations is such that the 1-halo terms can become key contributions to the $n$-point correlation functions, even at large scales. We interpret these results through analytical arguments and semi-analytical calculations in terms of the statistical properties of halos.

Cosmology and Nongalactic Astrophysics

Exploring Semantic Capacity of Terms

Jie Huang, Zilong Wang, Kevin Chen-Chuan Chang (2020)

We introduce and study the semantic capacity of terms. For example, the semantic capacity of "artificial intelligence" is higher than that of "linear regression", since artificial intelligence has a broader meaning scope. Understanding the semantic capacity of terms will help many downstream tasks in natural language processing. For this purpose, we propose a two-step model to investigate the semantic capacity of terms. The model takes a large text corpus as input and can evaluate the semantic capacity of terms if the corpus provides enough co-occurrence information about them. Extensive experiments in three fields demonstrate the effectiveness and rationality of our model compared with well-designed baselines and human-level evaluations.

Computation and Language, Artificial Intelligence

Your fairness may vary: Group fairness of pretrained language models in toxic text classification

Ioana Baldini, Dennis Wei, Karthikeyan Natesan Ramamurthy (2021)

We study the performance-fairness trade-off in more than a dozen fine-tuned LMs for toxic text classification. We empirically show that no blanket statement can be made with respect to the bias of large versus regular versus compressed models. Moreover, we find that focusing on fairness-agnostic performance metrics can lead to models with varied fairness characteristics.

Computation and Language, Machine Learning
