Given a binary prediction problem, which performance metric should the classifier optimize? We address this question by formalizing the problem of Metric Elicitation. The goal of metric elicitation is to discover the performance metric of a practitioner, which reflects her innate rewards (costs) for correct (incorrect) classification. In particular, we focus on eliciting binary classification performance metrics from pairwise feedback, where a practitioner is queried to provide a relative preference between two classifiers. By exploiting key geometric properties of the space of confusion matrices, we obtain provably query-efficient algorithms for eliciting linear and linear-fractional performance metrics. We further show that our method is robust to feedback noise and finite-sample noise.
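The idea of eliciting a linear metric from pairwise feedback can be illustrated with a minimal sketch. It is not the algorithm from the paper: it assumes, purely for illustration, that the boundary of achievable (TP, TN) pairs is a quarter circle, that the hidden metric is a linear function with weights (cos θ*, sin θ*), and that the practitioner answers comparisons without noise. The names `boundary_point`, `make_oracle`, and `elicit_linear_metric` are hypothetical. Under these assumptions the practitioner's utility is unimodal along the boundary, so a ternary search over the angle recovers the weights with a logarithmic number of queries.

```python
import math

def boundary_point(theta, radius=0.5):
    # Toy assumption: achievable (TP, TN) pairs on the upper boundary
    # form a quarter circle parametrized by the angle theta.
    return (radius * math.cos(theta), radius * math.sin(theta))

def make_oracle(true_theta):
    # Hypothetical noiseless practitioner with hidden linear metric
    # m1 * TP + m2 * TN, where (m1, m2) = (cos, sin) of true_theta.
    m = (math.cos(true_theta), math.sin(true_theta))

    def prefers_first(c1, c2):
        return m[0] * c1[0] + m[1] * c1[1] >= m[0] * c2[0] + m[1] * c2[1]

    return prefers_first

def elicit_linear_metric(prefers_first, tol=1e-3):
    # Ternary search over the angle: the utility is unimodal along the
    # boundary, so each comparison discards a third of the interval.
    lo, hi = 0.0, math.pi / 2
    queries = 0
    while hi - lo > tol:
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if prefers_first(boundary_point(a), boundary_point(b)):
            hi = b  # the maximizer cannot lie to the right of b
        else:
            lo = a  # the maximizer cannot lie to the left of a
        queries += 1
    theta = (lo + hi) / 2
    return (math.cos(theta), math.sin(theta)), queries
```

Calling `elicit_linear_metric(make_oracle(0.3))` returns the estimated weight pair and the number of comparisons used; with noisy feedback, each comparison would need to be repeated or otherwise made robust, which this sketch does not attempt.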
What is a fair performance metric? We consider the choice of fairness metrics through the lens of metric elicitation -- a principled framework for selecting performance metrics that best reflect implicit preferences. The use of metric elicitation enables a practitioner to tune the performance and fairness metrics to the task, context, and population at hand. Specifically, we propose a novel strategy to elicit group-fair performance metrics for multiclass classification problems with multiple sensitive groups that also includes selecting the trade-off between predictive performance and fairness violation. The proposed elicitation strategy requires only relative preference feedback and is robust to both finite sample and feedback noise.
Metric elicitation is a recent framework for eliciting performance metrics that best reflect implicit user preferences. This framework enables a practitioner to adjust the performance metrics based on the application, context, and population at hand. However, available elicitation strategies have been limited to linear (or fractional-linear) functions of predictive rates. In this paper, we develop an approach to elicit from a wider range of complex multiclass metrics defined by quadratic functions of rates by exploiting their local linear structure. We apply this strategy to elicit quadratic metrics for group-based fairness, and also discuss how it can be generalized to higher-order polynomials. Our elicitation strategies require only relative preference feedback and are robust to both feedback and finite sample noise.
We consider the problem of ranking $n$ players from partial pairwise comparison data under the Bradley-Terry-Luce model. For the first time in the literature, the minimax rate of this ranking problem is derived with respect to the Kendall's tau distance that measures the difference between two rank vectors by counting the number of
As pairwise ranking becomes broadly employed for elections, sports competitions, recommendations, and so on, attackers have strong incentives to manipulate the ranking list. They could inject malicious comparisons into the training data to fool the victim. Such a technique is called a poisoning attack in regression and classification tasks. In this paper, we present what is, to the best of our knowledge, the first systematic investigation of data poisoning attacks on pairwise ranking algorithms. These attacks can be formalized as dynamic and static games between the ranker and the attacker, and can be modeled as certain kinds of integer programming problems. To overcome the computational hurdle of the underlying integer programs, we reformulate them as distributionally robust optimization (DRO) problems, which are computationally tractable. Based on these DRO formulations, we propose two efficient poisoning attack algorithms and establish the associated theoretical guarantees. The effectiveness of the proposed attack strategies is demonstrated by a series of toy simulations and several real data experiments. These experimental results show that the proposed methods can significantly degrade the performance of the ranker, in the sense that the correlation between the true ranking list and the aggregated results decreases dramatically.
Both the median-based classifier and the quantile-based classifier are useful for discriminating high-dimensional data with heavy-tailed or skewed inputs. However, these methods are limited because they assign equal weight to each variable, with no regularization. The ensemble quantile classifier is a more flexible regularized classifier that provides better performance with high-dimensional data, asymmetric data, or when there are many irrelevant inputs. The improved performance is demonstrated by a simulation study as well as an application to text categorization. It is proven that the estimated parameters of the ensemble quantile classifier consistently estimate the minimal population loss under suitable general model assumptions. It is also shown that the ensemble quantile classifier is Bayes optimal under suitable assumptions with asymmetric Laplace distribution inputs.
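For orientation, the equal-weight quantile-based classifier that the abstract takes as its starting point can be sketched as follows. This is an illustrative sketch, not the ensemble quantile classifier itself: it assumes the usual componentwise form, in which each class is summarized by its per-variable θ-quantiles and a point is assigned to the class with the smallest summed check loss. All function names are hypothetical, and the quantile level `theta` is taken as given rather than tuned. Setting `theta = 0.5` gives the median-based classifier.

```python
def check_loss(u, theta):
    # Quantile (check) loss: theta * u for u >= 0, (theta - 1) * u for u < 0.
    return u * (theta - (1.0 if u < 0 else 0.0))

def empirical_quantile(values, theta):
    # Empirical quantile with linear interpolation between order statistics.
    s = sorted(values)
    pos = theta * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def fit_quantiles(X, y, theta):
    # For each class label, store the theta-quantile of every variable.
    quantiles = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        quantiles[label] = [empirical_quantile(col, theta) for col in zip(*rows)]
    return quantiles

def predict(x, quantiles, theta):
    # Assign x to the class whose quantile vector is closest in summed
    # check loss; every variable carries equal weight (no regularization).
    def distance(q):
        return sum(check_loss(xj - qj, theta) for xj, qj in zip(x, q))
    return min(quantiles, key=lambda label: distance(quantiles[label]))
```

Replacing the plain sum in `distance` with a weighted sum whose weights are learned under a penalty would be one way to move toward the regularized variant the abstract describes, though the exact formulation is not given here.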