Machine learning techniques can be useful in applications such as credit approval and college admission. However, to be classified more favorably in such contexts, an agent may decide to strategically withhold some of her features, such as bad test scores. This is a missing data problem with a twist: which data is missing \emph{depends on the chosen classifier}, because the specific classifier is what may create the incentive to withhold certain feature values. We address the problem of training classifiers that are robust to this behavior. We design three classification methods: {\sc Mincut}, {\sc Hill-Climbing} ({\sc HC}), and Incentive-Compatible Logistic Regression ({\sc IC-LR}). We show that {\sc Mincut} is optimal when the true distribution of data is fully known. However, it can produce complex decision boundaries, and hence be prone to overfitting in some cases. Based on a characterization of truthful classifiers (i.e., those that give no incentive to strategically hide features), we devise a simpler alternative called {\sc HC}, which consists of a hierarchical ensemble of out-of-the-box classifiers trained using a specialized hill-climbing procedure that we show to be convergent. For several reasons, {\sc Mincut} and {\sc HC} are not effective in utilizing a large number of complementarily informative features. To address this, we present {\sc IC-LR}, a modification of Logistic Regression that removes the incentive to strategically drop features. We also show that our algorithms perform well in experiments on real-world data sets, and present insights into their relative performance in different settings.
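To give a flavor of how a classifier like {\sc IC-LR} can remove the incentive to hide features, here is a minimal sketch (our illustration, with an assumed encoding, not the paper's exact procedure): encode features to nonnegative values so that a withheld feature becomes $0$ and larger encoded values are more favorable, then fit logistic regression with weights constrained to be non-negative via projected gradient descent.
\begin{verbatim}
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ic_lr(X, y, lr=0.1, epochs=500):
    """Logistic regression with non-negative weights (illustrative sketch).

    X: (n, d) nonnegative features, encoded so that higher values are
       more favorable and a withheld feature is represented as 0.
    y: (n,) labels in {0, 1}.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y)) / n
        b -= lr * np.mean(p - y)
        w = np.maximum(w, 0.0)  # projection: keep weights non-negative
    return w, b
\end{verbatim}
The projection step is what enforces incentive compatibility here: with $w \ge 0$ and withheld features encoded as $0$, dropping a feature can only decrease the score $w \cdot x + b$, so no agent gains by hiding anything.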
Strategic classification concerns the problem of learning in settings where users can strategically modify their features to improve outcomes. This setting applies broadly and has received much recent attention. But despite its practical significance, work in this space has so far been predominantly theoretical. In this paper we present a practical learning framework for strategic classification. Our approach directly minimizes the strategic empirical risk, achieved by differentiating through the strategic response of users. This provides flexibility that allows us to extend beyond the original problem formulation and towards more realistic learning scenarios. A series of experiments demonstrates the effectiveness of our approach on various learning settings.
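To make "differentiating through the strategic response" concrete in the simplest case (our notation, not necessarily the paper's), a user with features $x$ facing a classifier $f$ and a manipulation cost $c$ best-responds by
\[
x^\star(x; f) \in \arg\max_{x'} \big\{ f(x') - c(x, x') \big\},
\]
and the strategic empirical risk minimization problem is
\[
\min_{f} \; \frac{1}{n} \sum_{i=1}^{n} \ell\big( f(x^\star(x_i; f)),\, y_i \big),
\]
which can be optimized by backpropagating through $x^\star(x_i; f)$, after suitable smoothing, since the $\arg\max$ is generally not differentiable in $f$.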
Standard approaches to group-based notions of fairness, such as \emph{parity} and \emph{equalized odds}, try to equalize absolute measures of performance across known groups (based on race, gender, etc.). Consequently, a group that is inherently harder to classify may hold back the performance on other groups; and no guarantees can be provided for unforeseen groups. Instead, we propose a fairness notion whose guarantee, on each group $g$ in a class $\mathcal{G}$, is relative to the performance of the best classifier on $g$. We apply this notion to broad classes of groups, in particular, where (a) $\mathcal{G}$ consists of all possible groups (subsets) in the data, and (b) $\mathcal{G}$ is more streamlined. For the first setting, which is akin to groups being completely unknown, we devise the {\sc PF} (Proportional Fairness) classifier, which guarantees, on any possible group $g$, an accuracy that is proportional to that of the optimal classifier for $g$, scaled by the relative size of $g$ in the data set. Because all possible groups are included, some of which may be too complex to be relevant, the worst-case theoretical guarantees here must be proportionally weaker for smaller subsets. For the second setting, we devise the {\sc BeFair} (Best-effort Fair) framework, which seeks an accuracy, on every $g \in \mathcal{G}$, that approximates that of the optimal classifier on $g$, independent of the size of $g$. Aiming for such a guarantee results in a non-convex problem, and we design novel techniques to get around this difficulty when $\mathcal{G}$ is the set of linear hypotheses. We test our algorithms on real-world data sets, and present interesting comparative insights on their performance.
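In symbols, the two guarantees have the following flavor (our paraphrase; exact constants appear in the paper). The {\sc PF} classifier $h$ satisfies, for every possible group $g$ in a data set of size $n$,
\[
\mathrm{acc}(h; g) \;\gtrsim\; \frac{|g|}{n} \cdot \mathrm{acc}(h^\star_g; g),
\]
where $h^\star_g$ is the optimal classifier for $g$ alone, whereas {\sc BeFair} targets the size-independent guarantee $\mathrm{acc}(h; g) \ge \mathrm{acc}(h^\star_g; g) - \epsilon$ for every $g \in \mathcal{G}$.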
We consider classification and regression tasks where we have missing data and assume that the (clean) data resides in a low-rank subspace. Finding a hidden subspace is known to be computationally hard. Nevertheless, using a non-proper formulation, we give an efficient agnostic algorithm that classifies as well as the best linear classifier coupled with the best low-dimensional subspace in which the data resides. A direct implication is that our algorithm can classify, linearly (and non-linearly through kernels), provably as well as the best classifier that has access to the full data.
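The agnostic guarantee can be phrased as follows (our paraphrase). Let $\mathcal{H}_k$ denote predictors of the form $x \mapsto \mathrm{sign}(w \cdot P_U x)$, where $P_U$ projects onto a rank-$k$ subspace $U$; the algorithm, seeing only the incomplete data, outputs a (possibly non-proper, i.e., outside $\mathcal{H}_k$) predictor $\hat{h}$ with
\[
\mathrm{err}(\hat{h}) \;\le\; \min_{h \in \mathcal{H}_k} \mathrm{err}(h) + \epsilon.
\]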
Classic mechanism design often assumes that a bidder's action is restricted to reporting a type or a signal, possibly untruthfully. In today's digital economy, bidders hold increasing amounts of private information about the auctioned items, and due to legal or ethical concerns, they may demand to reveal partial but truthful information, as opposed to reporting untrue signals or misinformation. To accommodate such bidder behaviors in auction design, we propose and study a novel mechanism design setup where each bidder holds two kinds of information: (1) a private \emph{value type}, which can be misreported; and (2) a private \emph{information variable}, which the bidder may want to conceal or partially reveal, but, importantly, \emph{not} to misreport. We show that in this new setup, it is still possible to design mechanisms that are both \emph{Incentive and Information Compatible} (IIC). We develop two different black-box transformations, which convert any mechanism $\mathcal{M}$ for classic bidders into a mechanism $\mathcal{M}'$ for strategically reticent bidders, based on either the outcome of the expectation or the expectation of the outcome, respectively. We identify properties of the original mechanism $\mathcal{M}$ under which the transformation leads to an IIC mechanism $\mathcal{M}'$. Interestingly, as corollaries of these results, we show that running VCG with expected bidder values maximizes welfare, whereas the mechanism using the expected outcome of Myerson's auction maximizes revenue. Finally, we study how regulation on the auctioneer's usage of information may lead to more robust mechanisms.
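As a toy illustration of the welfare corollary in the single-item case (our simplification; the function name and sampling setup are hypothetical): VCG reduces to a second-price auction, run here on each bidder's expected value conditional on whatever information that bidder chose to reveal.
\begin{verbatim}
import numpy as np

def second_price_on_expected_values(value_samples):
    """Single-item VCG (second-price) auction run on expected values.

    value_samples: list of 1-D arrays; value_samples[i] holds samples of
    bidder i's value conditional on the information bidder i revealed,
    so its mean estimates that bidder's expected value given revelation.
    Returns (winner index, price charged to the winner).
    """
    expected = np.array([np.mean(s) for s in value_samples])
    winner = int(np.argmax(expected))
    # VCG/second-price payment: the highest competing expected value.
    price = float(np.partition(expected, -2)[-2])
    return winner, price

# Example: three bidders with different residual uncertainty.
rng = np.random.default_rng(0)
samples = [rng.uniform(0, 10, 1000),
           rng.uniform(4, 6, 1000),
           rng.uniform(0, 4, 1000)]
print(second_price_on_expected_values(samples))
\end{verbatim}
Allocating to the highest expected value maximizes expected welfare for this single-item case; the paper's transformations address the general question of when such expectation-based variants remain incentive and information compatible.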
Can we learn a multi-class classifier from only data of a single class? We show that without any assumptions on the loss functions, models, and optimizers, we can successfully learn a multi-class classifier from only data of a single class with a rigorous consistency guarantee when confidences (i.e., the class-posterior probabilities for all the classes) are available. Specifically, we propose an empirical risk minimization framework that is loss-/model-/optimizer-independent. Instead of constructing a boundary between the given class and other classes, our method can conduct discriminative classification between all the classes even if no data from the other classes are provided. We further theoretically and experimentally show that our method can be Bayes-consistent with a simple modification even if the provided confidences are highly noisy. Then, we provide an extension of our method for the case where data from a subset of all the classes are available. Experimental results demonstrate the effectiveness of our methods.
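One natural instantiation of such a framework (our sketch, not necessarily the paper's exact estimator): with samples drawn only from one class, say class $0$, and per-sample confidence vectors $p(\cdot \mid x)$, the multi-class risk can be rewritten, up to a constant, as a weighted expectation over the single-class data, with weight $p(y \mid x)/p(0 \mid x)$ on each class $y$, yielding the weighted cross-entropy objective below.
\begin{verbatim}
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def train_from_single_class(X, conf, K, lr=0.1, epochs=500):
    """Multi-class linear classifier from class-0 data plus confidences.

    X:    (n, d) samples drawn only from class 0.
    conf: (n, K) class-posterior confidences p(y | x) per sample.
    Minimizes the risk-rewritten weighted cross-entropy: each sample
    contributes loss on every class y, weighted by conf[:, y] / conf[:, 0].
    """
    n, d = X.shape
    W = np.zeros((d, K))
    weights = conf / conf[:, [0]]          # (n, K) importance weights
    for _ in range(epochs):
        P = softmax(X @ W)                 # (n, K) predicted probabilities
        # Gradient of sum_i sum_y weights[i, y] * (-log P[i, y]) w.r.t. logits:
        G = P * weights.sum(axis=1, keepdims=True) - weights
        W -= lr * (X.T @ G) / n
    return W
\end{verbatim}
The ratios $p(y \mid x)/p(0 \mid x)$ are exactly where the provided confidences enter; noisy confidences perturb these weights, which is the failure mode the abstract's noise-robust modification is designed to handle.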