Fairness and Robustness in Invariant Learning: A Case Study in Toxicity Classification

84 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Robert Adragna

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Robert Adragna - Elliot Creager - David Madras

التعلم الآلي الذكاء الاصطناعي الحساب واللغة

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Robustness is of central importance in machine learning and has given rise to the fields of domain generalization and invariant learning, which are concerned with improving performance on a test distribution distinct from but related to the training distribution. In light of recent work suggesting an intimate connection between fairness and robustness, we investigate whether algorithms from robust ML can be used to improve the fairness of classifiers that are trained on biased data and tested on unbiased data. We apply Invariant Risk Minimization (IRM), a domain generalization algorithm that employs a causal discovery inspired method to find robust predictors, to the task of fairly predicting the toxicity of internet comments. We show that IRM achieves better out-of-distribution accuracy and fairness than Empirical Risk Minimization (ERM) methods, and analyze both the difficulties that arise when applying IRM in practice and the conditions under which IRM will likely be effective in this scenario. We hope that this work will inspire further studies of how robust machine learning methods relate to algorithmic fairness.

قيم البحث

130 - Sahaj Garg , Vincent Perot , Nicole Limtiaco 2018

In this paper, we study counterfactual fairness in text classification, which asks the question: How would the prediction change if the sensitive attribute referenced in the example were different? Toxicity classifiers demonstrate a counterfactual fa irness issue by predicting that Some people are gay is toxic while Some people are straight is nontoxic. We offer a metric, counterfactual token fairness (CTF), for measuring this particular form of fairness in text classifiers, and describe its relationship with group fairness. Further, we offer three approaches, blindness, counterfactual augmentation, and counterfactual logit pairing (CLP), for optimizing counterfactual token fairness during training, bridging the robustness and fairness literature. Empirically, we find that blindness and CLP address counterfactual token fairness. The methods do not harm classifier performance, and have varying tradeoffs with group fairness. These approaches, both for measurement and optimization, provide a new path forward for addressing fairness concerns in text classification.

التعلم الآلي التعلم الالي

Fairness-aware Classification: Criterion, Convexity, and Bounds

115 - Yongkai Wu , Lu Zhang , Xintao Wu 2018

Fairness-aware classification is receiving increasing attention in the machine learning fields. Recently research proposes to formulate the fairness-aware classification as constrained optimization problems. However, several limitations exist in prev ious works due to the lack of a theoretical framework for guiding the formulation. In this paper, we propose a general framework for learning fair classifiers which addresses previous limitations. The framework formulates various commonly-used fairness metrics as convex constraints that can be directly incorporated into classic classification models. Within the framework, we propose a constraint-free criterion on the training data which ensures that any classifier learned from the data is fair. We also derive the constraints which ensure that the real fairness metric is satisfied when surrogate functions are used to achieve convexity. Our framework can be used to for formulating fairness-aware classification with fairness guarantee and computational efficiency. The experiments using real-world datasets demonstrate our theoretical results and show the effectiveness of proposed framework and methods.

التعلم الآلي الذكاء الاصطناعي أجهزة الكمبيوتر والمجتمع

A Reputation Mechanism Is All You Need: Collaborative Fairness and Adversarial Robustness in Federated Learning

228 - Xinyi Xu , Lingjuan Lyu 2020

Federated learning (FL) is an emerging practical framework for effective and scalable machine learning among multiple participants, such as end users, organizations and companies. However, most existing FL or distributed learning frameworks have not well addressed two important issues together: collaborative fairness and adversarial robustness (e.g. free-riders and malicious participants). In conventional FL, all participants receive the global model (equal rewards), which might be unfair to the high-contributing participants. Furthermore, due to the lack of a safeguard mechanism, free-riders or malicious adversaries could game the system to access the global model for free or to sabotage it. In this paper, we propose a novel Robust and Fair Federated Learning (RFFL) framework to achieve collaborative fairness and adversarial robustness simultaneously via a reputation mechanism. RFFL maintains a reputation for each participant by examining their contributions via their uploaded gradients (using vector similarity) and thus identifies non-contributing or malicious participants to be removed. Our approach differentiates itself by not requiring any auxiliary/validation dataset. Extensive experiments on benchmark datasets show that RFFL can achieve high fairness and is very robust to different types of adversaries while achieving competitive predictive accuracy.

التعلم الآلي التعلم الالي

Disentangling Neural Architectures and Weights: A Case Study in Supervised Classification

232 - Nicolo Colombo , Yang Gao 2020

The history of deep learning has shown that human-designed problem-specific networks can greatly improve the classification performance of general neural models. In most practical cases, however, choosing the optimal architecture for a given task rem ains a challenging problem. Recent architecture-search methods are able to automatically build neural models with strong performance but fail to fully appreciate the interaction between neural architecture and weights. This work investigates the problem of disentangling the role of the neural structure and its edge weights, by showing that well-trained architectures may not need any link-specific fine-tuning of the weights. We compare the performance of such weight-free networks (in our case these are binary networks with {0, 1}-valued weights) with random, weight-agnostic, pruned and standard fully connected networks. To find the optimal weight-agnostic network, we use a novel and computationally efficient method that translates the hard architecture-search problem into a feasible optimization problem.More specifically, we look at the optimal task-specific architectures as the optimal configuration of binary networks with {0, 1}-valued weights, which can be found through an approximate gradient descent strategy. Theoretical convergence guarantees of the proposed algorithm are obtained by bounding the error in the gradient approximation and its practical performance is evaluated on two real-world data sets. For measuring the structural similarities between different architectures, we use a novel spectral approach that allows us to underline the intrinsic differences between real-valued networks and weight-free architectures.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

A Maximal Correlation Approach to Imposing Fairness in Machine Learning

169 - Joshua Lee , Yuheng Bu , Prasanna Sattigeri 2020

As machine learning algorithms grow in popularity and diversify to many industries, ethical and legal concerns regarding their fairness have become increasingly relevant. We explore the problem of algorithmic fairness, taking an information-theoretic view. The maximal correlation framework is introduced for expressing fairness constraints and shown to be capable of being used to derive regularizers that enforce independence and separation-based fairness criteria, which admit optimization algorithms for both discrete and continuous variables which are more computationally efficient than existing algorithms. We show that these algorithms provide smooth performance-fairness tradeoff curves and perform competitively with state-of-the-art methods on both discrete datasets (COMPAS, Adult) and continuous datasets (Communities and Crimes).

التعلم الآلي الذكاء الاصطناعي نظرية المعلومات