
Achieving Fairness via Post-Processing in Web-Scale Recommender Systems

Added by Preetam Nandy
Publication date: 2020
Research language: English





Building fair recommender systems is a challenging and extremely important area of study due to its immense impact on society. We focus on two commonly accepted notions of fairness for machine learning models powering such recommender systems, namely equality of opportunity and equalized odds. These measures of fairness make sure that equally qualified (or unqualified) candidates are treated equally regardless of their protected attribute status (such as gender or race). In this paper, we propose scalable methods for achieving equality of opportunity and equalized odds in rankings in the presence of position bias, which commonly plagues data generated from recommendation systems. Our algorithms are model agnostic in the sense that they depend only on the final scores provided by a model, making them easily applicable to virtually all web-scale recommender systems. We conduct extensive simulations as well as real-world experiments to show the efficacy of our approach.
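To make the model-agnostic post-processing idea concrete, here is a minimal sketch (not the paper's algorithm, and without the position-bias correction it handles): assuming a binary protected attribute and synthetic scores and labels, it picks per-group score thresholds so that the true positive rate among qualified candidates is approximately equal across groups, which is the equality-of-opportunity condition.

```python
# Minimal illustrative sketch: post-process model scores so that the true positive
# rate among qualified candidates is (approximately) equal across protected groups,
# using only the final scores. Not the paper's exact algorithm.
import numpy as np

def group_thresholds_for_equal_opportunity(scores, labels, groups, target_tpr=0.8):
    """For each group, pick the score threshold whose TPR among qualified
    (label == 1) members is closest to target_tpr."""
    thresholds = {}
    for g in np.unique(groups):
        qualified = scores[(groups == g) & (labels == 1)]
        if len(qualified) == 0:
            thresholds[g] = np.inf
            continue
        # The (1 - target_tpr) quantile of qualified scores yields a TPR of about target_tpr.
        thresholds[g] = np.quantile(qualified, 1.0 - target_tpr)
    return thresholds

def post_process(scores, groups, thresholds):
    """Shift each group's scores by its threshold so that a single global cutoff of 0
    treats equally qualified candidates from all groups alike."""
    adjusted = scores.astype(float).copy()
    for g, t in thresholds.items():
        adjusted[groups == g] -= t
    return adjusted

# Example usage on synthetic data with a group-dependent score bias.
rng = np.random.default_rng(0)
n = 10_000
groups = rng.integers(0, 2, size=n)
labels = rng.integers(0, 2, size=n)
scores = rng.normal(loc=labels + 0.3 * groups, scale=1.0)
thresholds = group_thresholds_for_equal_opportunity(scores, labels, groups)
fair_scores = post_process(scores, groups, thresholds)
```

Equalized odds would additionally constrain the false positive rate among unqualified candidates in the same per-group fashion.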



Related research

Ensemble weather predictions require statistical post-processing of systematic errors to obtain reliable and accurate probabilistic forecasts. Traditionally, this is accomplished with distributional regression models in which the parameters of a predictive distribution are estimated from a training period. We propose a flexible alternative based on neural networks that can incorporate nonlinear relationships between arbitrary predictor variables and forecast distribution parameters, learned automatically in a data-driven way rather than requiring pre-specified link functions. In a case study of 2-meter temperature forecasts at surface stations in Germany, the neural network approach significantly outperforms benchmark post-processing methods while being computationally more affordable. Key components of this improvement are the use of auxiliary predictor variables and station-specific information with the help of embeddings. Furthermore, the trained neural network can be used to gain insight into the importance of meteorological variables, thereby challenging the notion of neural networks as uninterpretable black boxes. Our approach can easily be extended to other statistical post-processing and forecasting problems. We anticipate that recent advances in deep learning, combined with the ever-increasing amounts of model and observation data, will transform the post-processing of numerical weather forecasts in the coming decade.
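The distributional-regression-with-embeddings idea can be sketched in a few lines. The following is an illustrative stand-in (assuming PyTorch, a Gaussian predictive distribution, and placeholder predictor and observation tensors), not the authors' implementation:

```python
# Sketch: map predictors plus a learned station embedding to the parameters of a
# Gaussian forecast distribution, trained by minimizing the negative log-likelihood.
import torch
import torch.nn as nn

class EmbeddingPostProcessor(nn.Module):
    def __init__(self, n_stations, n_predictors, emb_dim=2, hidden=32):
        super().__init__()
        self.station_emb = nn.Embedding(n_stations, emb_dim)
        self.net = nn.Sequential(
            nn.Linear(n_predictors + emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2),  # outputs: mean and unconstrained spread
        )

    def forward(self, predictors, station_id):
        x = torch.cat([predictors, self.station_emb(station_id)], dim=-1)
        out = self.net(x)
        mu = out[:, 0]
        sigma = torch.nn.functional.softplus(out[:, 1]) + 1e-3  # keep std positive
        return mu, sigma

# One training step on placeholder data (CRPS could replace the Gaussian NLL).
model = EmbeddingPostProcessor(n_stations=500, n_predictors=8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

predictors = torch.randn(64, 8)            # placeholder ensemble/auxiliary predictors
station_id = torch.randint(0, 500, (64,))  # placeholder station indices
obs = torch.randn(64)                      # placeholder observed temperatures

mu, sigma = model(predictors, station_id)
loss = torch.nn.GaussianNLLLoss()(mu, obs, sigma**2)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```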
An important preprocessing step in most data analysis pipelines aims to extract a small set of sources that explain most of the data. Currently used algorithms for blind source separation (BSS), however, often fail to extract the desired sources and need extensive cross-validation. In contrast, their rarely used probabilistic counterparts can get away with little cross-validation and are more accurate and reliable, but no simple and scalable implementations are available. Here we present a novel probabilistic BSS framework (DECOMPOSE) that can be flexibly adjusted to the data, is extensible and easy to use, adapts to individual sources and handles large-scale data through algorithmic efficiency. DECOMPOSE encompasses and generalises many traditional BSS algorithms such as PCA, ICA and NMF, and we demonstrate substantial improvements in accuracy and robustness on artificial and real data.
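For readers unfamiliar with the classical BSS setting that DECOMPOSE generalises, a minimal example with scikit-learn's FastICA (one of the traditional algorithms mentioned above, not the probabilistic framework itself) recovers independent sources from linear mixtures:

```python
# Classical blind source separation: observe linear mixtures of unknown sources
# and recover the sources (up to scale and ordering) with FastICA.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
sources = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t)), rng.laplace(size=t.size)]
mixing = rng.normal(size=(3, 3))
observations = sources @ mixing.T            # observed mixtures, shape (2000, 3)

ica = FastICA(n_components=3, random_state=0)
recovered = ica.fit_transform(observations)  # estimated sources
```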
In this work, we consider how preference models in interactive recommendation systems determine the availability of content and users' opportunities for discovery. We propose an evaluation procedure based on stochastic reachability to quantify the maximum probability of recommending a target piece of content to a user for a set of allowable strategic modifications. This framework allows us to compute an upper bound on the likelihood of recommendation with minimal assumptions about user behavior. Stochastic reachability can be used to detect biases in the availability of content and diagnose limitations in the opportunities for discovery granted to users. We show that this metric can be computed efficiently as a convex program for a variety of practical settings, and further argue that reachability is not inherently at odds with accuracy. We demonstrate evaluations of recommendation algorithms trained on large datasets of explicit and implicit ratings. Our results illustrate how preference models, selection rules, and user interventions impact reachability and how these effects can be distributed unevenly.
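As a toy stand-in for the reachability computation (not the paper's formulation), the sketch below treats the probability of recommending a target item as an affine function of a user's modifiable rating vector, with a hypothetical weight vector w and offset b standing in for a fitted preference model, and maximizes it over a box of allowable modifications using cvxpy:

```python
# Toy reachability bound: maximize an affine surrogate for the target item's
# recommendation probability over bounded strategic rating modifications.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n_items = 20
w = rng.normal(size=n_items) * 0.05      # hypothetical sensitivity of the target-item score
b = 0.1                                  # hypothetical baseline probability
current_ratings = rng.uniform(0, 5, size=n_items)

delta = cp.Variable(n_items)             # strategic modification of the user's ratings
prob_target = w @ (current_ratings + delta) + b

constraints = [
    cp.abs(delta) <= 1.0,                # allowable per-item modification budget
    prob_target <= 1.0,
    prob_target >= 0.0,
]
problem = cp.Problem(cp.Maximize(prob_target), constraints)
problem.solve()
print("max reachability (toy model):", prob_target.value)
```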
Di Wu, Qi Tang, Yongle Zhao (2020)
8-bit quantization has been widely applied to accelerate network inference in various deep learning applications. There are two kinds of quantization methods: training-based quantization and post-training quantization. The training-based approach suffers from a cumbersome training process, while post-training quantization may lead to an unacceptable accuracy drop. In this paper, we present an efficient and simple post-training method via scale optimization, named EasyQuant (EQ), that obtains accuracy comparable to the training-based method. Specifically, we first alternately optimize the scales of weights and activations for all layers, targeting the convolutional outputs, to obtain high quantization precision. Then, we lower the bit width to INT7 for both weights and activations, and adopt INT16 intermediate storage and an integer Winograd convolution implementation to accelerate inference. Experimental results on various computer vision tasks show that EQ outperforms the TensorRT method and can achieve near-INT8 accuracy at 7-bit width post-training.
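A heavily simplified sketch of the scale-optimization idea (not the exact EasyQuant procedure, which the abstract describes as alternately optimizing weight and activation scales against convolutional outputs across all layers): for a single linear layer, alternately search over candidate weight and activation scales to minimize the quantized output error.

```python
# Sketch: alternating search for per-layer weight/activation quantization scales
# that minimize the error of the quantized layer output vs. the full-precision output.
import numpy as np

def quantize(x, scale, bits=8):
    qmax = 2 ** (bits - 1) - 1
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def optimize_scales(weights, activations, bits=8, iters=3):
    qmax = 2 ** (bits - 1) - 1
    ref = activations @ weights.T                    # full-precision layer output
    w_scale = np.abs(weights).max() / qmax           # naive max-based starting scales
    a_scale = np.abs(activations).max() / qmax

    def output_error(ws, as_):
        q_out = quantize(activations, as_, bits) @ quantize(weights, ws, bits).T
        return np.linalg.norm(q_out - ref)

    for _ in range(iters):
        # Optimize the weight scale with the activation scale fixed, then swap roles.
        w_scale = min([w_scale * c for c in np.linspace(0.5, 1.2, 30)],
                      key=lambda ws: output_error(ws, a_scale))
        a_scale = min([a_scale * c for c in np.linspace(0.5, 1.2, 30)],
                      key=lambda as_: output_error(w_scale, as_))
    return w_scale, a_scale

rng = np.random.default_rng(0)
w_scale, a_scale = optimize_scales(rng.normal(size=(64, 128)), rng.normal(size=(32, 128)))
```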
Set classification aims to classify a set of observations as a whole, as opposed to classifying individual observations separately. To formally understand the unfamiliar concept of binary set classification, we first investigate the optimal decision rule under the normal distribution, which utilizes the empirical covariance of the set to be classified. We show that the number of observations in the set plays a critical role in bounding the Bayes risk. Under this framework, we further propose new methods of set classification. For the case where only a few parameters of the model drive the difference between two classes, we propose a computationally efficient approach to parameter estimation using linear programming, leading to the Covariance-engaged LInear Programming Set (CLIPS) classifier. Its theoretical properties are investigated for both the independent case and various (short-range and long-range dependent) time series structures among observations within each set. The convergence rates of estimation errors and risk of the CLIPS classifier are established to show that having multiple observations in a set leads to faster convergence rates, compared to the standard classification situation in which there is only one observation in the set. The applicable domains in which CLIPS performs better than competitors are highlighted in a comprehensive simulation study. Finally, we illustrate the usefulness of the proposed methods in the classification of real image data in histopathology.
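The set-classification setting can be illustrated with a simple Gaussian likelihood rule (the baseline normal-model view described above, not the CLIPS estimator): an entire set is assigned to the class that maximizes the summed log-likelihood of its observations.

```python
# Sketch: fit a Gaussian per class from pooled training observations, then classify
# a whole set by the class with the largest summed log-density over the set.
import numpy as np
from scipy.stats import multivariate_normal

def fit_class_models(train_sets, train_labels):
    """Pool observations of all training sets sharing a label; fit mean and covariance."""
    models = {}
    for label in np.unique(train_labels):
        pooled = np.vstack([s for s, y in zip(train_sets, train_labels) if y == label])
        models[label] = multivariate_normal(pooled.mean(axis=0), np.cov(pooled.T))
    return models

def classify_set(obs_set, models):
    """Assign the whole set to the class maximizing the summed log-likelihood."""
    return max(models, key=lambda label: models[label].logpdf(obs_set).sum())

# Example on synthetic sets: 10 training sets per class, 20 observations each.
rng = np.random.default_rng(0)
train_sets = [rng.normal(loc=y, size=(20, 3)) for y in (0, 1) for _ in range(10)]
train_labels = [y for y in (0, 1) for _ in range(10)]
models = fit_class_models(train_sets, train_labels)
print(classify_set(rng.normal(loc=1.0, size=(15, 3)), models))
```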
