FARF: A Fair and Adaptive Random Forests Classifier

83 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Wenbin Zhang

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Wenbin Zhang - Albert Bifet - Xiangliang Zhang

التعلم الآلي الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

As Artificial Intelligence (AI) is used in more applications, the need to consider and mitigate biases from the learned models has followed. Most works in developing fair learning algorithms focus on the offline setting. However, in many real-world applications data comes in an online fashion and needs to be processed on the fly. Moreover, in practical application, there is a trade-off between accuracy and fairness that needs to be accounted for, but current methods often have multiple hyperparameters with non-trivial interaction to achieve fairness. In this paper, we propose a flexible ensemble algorithm for fair decision-making in the more challenging context of evolving online settings. This algorithm, called FARF (Fair and Adaptive Random Forests), is based on using online component classifiers and updating them according to the current distribution, that also accounts for fairness and a single hyperparameters that alters fairness-accuracy balance. Experiments on real-world discriminated data streams demonstrate the utility of FARF.

قيم البحث

114 - Ronan Perry , Ronak Mehta , Richard Guo 2019

Information-theoretic quantities, such as conditional entropy and mutual information, are critical data summaries for quantifying uncertainty. Current widely used approaches for computing such quantities rely on nearest neighbor methods and exhibit b oth strong performance and theoretical guarantees in certain simple scenarios. However, existing approaches fail in high-dimensional settings and when different features are measured on different scales.We propose decision forest-based adaptive nearest neighbor estimators and show that they are able to effectively estimate posterior probabilities, conditional entropies, and mutual information even in the aforementioned settings.We provide an extensive study of efficacy for classification and posterior probability estimation, and prove certain forest-based approaches to be consistent estimators of the true posteriors and derived information-theoretic quantities under certain assumptions. In a real-world connectome application, we quantify the uncertainty about neuron type given various cellular features in the Drosophila larva mushroom body, a key challenge for modern neuroscience.

التعلم الآلي التعلم الالي

Adversarial Random Forest Classifier for Automated Game Design

90 - Thomas Maurer , Matthew Guzdial 2021

Autonomous game design, generating games algorithmically, has been a longtime goal within the technical games research field. However, existing autonomous game design systems have relied in large part on human-authoring for game design knowledge, suc h as fitness functions in search-based methods. In this paper, we describe an experiment to attempt to learn a human-like fitness function for autonomous game design in an adversarial manner. While our experimental work did not meet our expectations, we present an analysis of our system and results that we hope will be informative to future autonomous game design research.

التعلم الآلي الذكاء الاصطناعي

Classifier Chains: A Review and Perspectives

108 - Jesse Read , Bernhard Pfahringer , Geoff Holmes 2019

The family of methods collectively known as classifier chains has become a popular approach to multi-label learning problems. This approach involves linking together off-the-shelf binary classifiers in a chain structure, such that class label predict ions become features for other classifiers. Such methods have proved flexible and effective and have obtained state-of-the-art empirical performance across many datasets and multi-label evaluation metrics. This performance led to further studies of how exactly it works, and how it could be improved, and in the recent decade numerous studies have explored classifier chains mechanisms on a theoretical level, and many improvements have been made to the training and inference procedures, such that this method remains among the state-of-the-art options for multi-label learning. Given this past and ongoing interest, which covers a broad range of applications and research themes, the goal of this work is to provide a review of classifier chains, a survey of the techniques and extensions provided in the literature, as well as perspectives for this approach in the domain of multi-label classification in the future. We conclude positively, with a number of recommendations for researchers and practitioners, as well as outlining a number of areas for future research.

التعلم الآلي الذكاء الاصطناعي التعلم الالي

When are Deep Networks really better than Random Forests at small sample sizes?

120 - Haoyin Xu , Michael Ainsworth , Yu-Chung Peng 2021

Random forests (RF) and deep networks (DN) are two of the most popular machine learning methods in the current scientific literature and yield differing levels of performance on different data modalities. We wish to further explore and establish the conditions and domains in which each approach excels, particularly in the context of sample size and feature dimension. To address these issues, we tested the performance of these approaches across tabular, image, and audio settings using varying model parameters and architectures. Our focus is on datasets with at most 10,000 samples, which represent a large fraction of scientific and biomedical datasets. In general, we found RF to excel at tabular and structured data (image and audio) with small sample sizes, whereas DN performed better on structured data with larger sample sizes. Although we plan to continue updating this technical report in the coming months, we believe the current preliminary results may be of interest to others.

التعلم الآلي الذكاء الاصطناعي

Predict Future Sales using Ensembled Random Forests

69 - Yuwei Zhang , Xin Wu , Chenyang Gu 2019

This is a method report for the Kaggle data competition Predict future sales. In this paper, we propose a rather simple approach to future sales predicting based on feature engineering, Random Forest Regressor and ensemble learning. Its performance t urned out to exceed many of the conventional methods and get final score 0.88186, representing root mean squared error. As of this writing, our model ranked 5th on the leaderboard. (till 8.5.2018)

التعلم الآلي