Research papers and master's and doctoral theses published by Zhiwei Steven Wu

Private Multi-Task Learning: Formulation and Applications to Federated Learning

140 - Shengyuan Hu, Zhiwei Steven Wu, Virginia Smith 2021

Many problems in machine learning rely on multi-task learning (MTL), in which the goal is to solve multiple related machine learning tasks simultaneously. MTL is particularly relevant for privacy-sensitive applications in areas such as healthcare, finance, and IoT computing, where sensitive data from multiple, varied sources are shared for the purpose of learning. In this work, we formalize notions of task-level privacy for MTL via joint differential privacy (JDP), a relaxation of differential privacy for mechanism design and distributed optimization. We then propose an algorithm for mean-regularized MTL, an objective commonly used for applications in personalized federated learning, subject to JDP. We analyze our objective and solver, providing certifiable guarantees on both privacy and utility. Empirically, we find that our method allows for improved privacy/utility trade-offs relative to global baselines across common federated learning benchmarks.
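
The abstract above does not spell out the mean-regularized objective, so the following is only a minimal sketch under an assumed, commonly used form: the sum of per-task losses plus a penalty `(lam / 2) * ||w_k - w_bar||^2` that pulls each task model `w_k` toward the average model `w_bar`. The function name, the parameter `lam`, and the precomputed `task_losses` are illustrative assumptions, and the sketch includes no privacy mechanism.

```python
from typing import Sequence


def mean_regularized_objective(
    models: Sequence[Sequence[float]],
    task_losses: Sequence[float],
    lam: float,
) -> float:
    """Assumed form: sum_k loss_k + (lam / 2) * sum_k ||w_k - w_bar||^2."""
    num_tasks = len(models)
    dim = len(models[0])
    # Coordinate-wise average of the per-task models.
    w_bar = [sum(w[j] for w in models) / num_tasks for j in range(dim)]
    # Squared distance of every task model to the average model.
    penalty = sum(
        sum((w[j] - w_bar[j]) ** 2 for j in range(dim)) for w in models
    )
    return sum(task_losses) + 0.5 * lam * penalty


# Two tasks whose models sit one unit on either side of their mean.
value = mean_regularized_objective([[0.0, 0.0], [2.0, 0.0]], [1.0, 1.0], lam=1.0)
```

Setting `lam` to zero recovers fully separate per-task training, while a very large `lam` forces all task models toward a single shared model.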

Machine Learning, Cryptography and Security

Iterative Methods for Private Synthetic Data: Unifying Framework and New Methods

78 - Terrance Liu, Giuseppe Vietri, Zhiwei Steven Wu 2021

We study private synthetic data generation for query release, where the goal is to construct a sanitized version of a sensitive dataset, subject to differential privacy, that approximately preserves the answers to a large collection of statistical queries. We first present an algorithmic framework that unifies a long line of iterative algorithms in the literature. Under this framework, we propose two new methods. The first method, private entropy projection (PEP), can be viewed as an advanced variant of MWEM that adaptively reuses past query measurements to boost accuracy. Our second method, generative networks with the exponential mechanism (GEM), circumvents computational bottlenecks in algorithms such as MWEM and PEP by optimizing over generative models parameterized by neural networks, which capture a rich family of distributions while enabling fast gradient-based optimization. We demonstrate that PEP and GEM empirically outperform existing algorithms. Furthermore, we show that GEM nicely incorporates prior information from public data while overcoming limitations of PMW^Pub, the existing state-of-the-art method that also leverages public data.
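
Iterative algorithms in the MWEM family use the exponential mechanism to pick a query on which the current synthetic data is inaccurate. As a rough illustration, the sketch below computes the selection probabilities of the standard exponential mechanism, where candidate `i` is chosen with probability proportional to `exp(epsilon * score_i / (2 * sensitivity))`. The function name and the example scores are assumptions for illustration, not the paper's implementation.

```python
import math
from typing import List, Sequence


def exponential_mechanism_probs(
    scores: Sequence[float], epsilon: float, sensitivity: float
) -> List[float]:
    """Selection probabilities proportional to exp(epsilon * score / (2 * sensitivity))."""
    scaled = [epsilon * s / (2.0 * sensitivity) for s in scores]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scaled)
    weights = [math.exp(x - top) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical per-query errors used as scores: larger error, more likely to be picked.
probs = exponential_mechanism_probs([0.0, 2.0, 1.0], epsilon=1.0, sensitivity=1.0)
```

A smaller `epsilon` flattens the distribution toward uniform, which is the privacy cost of hiding which query was actually the worst.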

Machine Learning, Cryptography and Security, Data Structures and Algorithms

Private Post-GAN Boosting

65 - Marcel Neunhoeffer, Zhiwei Steven Wu, Cynthia Dwork 2020

Differentially private GANs have proven to be a promising approach for generating realistic synthetic data without compromising the privacy of individuals. Due to the privacy-protective noise introduced in the training, the convergence of GANs becomes even more elusive, which often leads to poor utility in the output generator at the end of training. We propose Private post-GAN boosting (Private PGB), a differentially private method that combines samples produced by the sequence of generators obtained during GAN training to create a high-quality synthetic dataset. To that end, our method leverages the Private Multiplicative Weights method (Hardt and Rothblum, 2010) to reweight generated samples. We evaluate Private PGB on two-dimensional toy data, MNIST images, US Census data and a standard machine learning prediction task. Our experiments show that Private PGB improves upon a standard private GAN approach across a collection of quality measures. We also provide a non-private variant of PGB that improves the data quality of standard GAN training.
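
To give a feel for what reweighting samples with multiplicative weights means, here is a generic, non-private update step. It is not the Private PGB procedure itself: the single query, the target answer, and the step size `eta` are all assumptions made for illustration. Each sample's weight is multiplied by `exp(eta * q(x) * error)`, where `error` is the gap between a target answer and the current weighted answer, and the weights are then renormalized.

```python
import math
from typing import List, Sequence


def mw_reweight(
    weights: Sequence[float],
    query_values: Sequence[float],
    target: float,
    eta: float = 1.0,
) -> List[float]:
    """One multiplicative-weights step that nudges the weighted query answer toward target."""
    # Current answer of the query under the weighted sample distribution.
    current = sum(w * q for w, q in zip(weights, query_values))
    error = target - current
    # Up-weight samples where the query fires if the answer is too low, down-weight otherwise.
    updated = [w * math.exp(eta * q * error) for w, q in zip(weights, query_values)]
    total = sum(updated)
    return [w / total for w in updated]


# Two samples, a query that fires only on the first, and a target answer of 0.8.
new_weights = mw_reweight([0.5, 0.5], [1.0, 0.0], target=0.8)
```

Repeating such steps over many queries concentrates weight on the generated samples that best match the measured statistics.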

Machine Learning, Cryptography and Security, Computers and Society

Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification

72 - Yingxue Zhou, Zhiwei Steven Wu, Arindam Banerjee 2020

Differentially private SGD (DP-SGD) is one of the most popular methods for solving differentially private empirical risk minimization (ERM). Due to its noisy perturbation on each gradient update, the error rate of DP-SGD scales with the ambient dimension $p$, the number of parameters in the model. Such dependence can be problematic for over-parameterized models where $p \gg n$, the number of training samples. Existing lower bounds on private ERM show that such dependence on $p$ is inevitable in the worst case. In this paper, we circumvent the dependence on the ambient dimension by leveraging a low-dimensional structure of gradient space in deep networks -- that is, the stochastic gradients for deep nets usually stay in a low-dimensional subspace in the training process. We propose Projected DP-SGD that performs noise reduction by projecting the noisy gradients to a low-dimensional subspace, which is given by the top gradient eigenspace on a small public dataset. We provide a general sample complexity analysis on the public dataset for the gradient subspace identification problem and demonstrate that under certain low-dimensional assumptions the public sample complexity only grows logarithmically in $p$. Finally, we provide a theoretical analysis and empirical evaluations to show that our method can substantially improve the accuracy of DP-SGD in the high privacy regime (corresponding to low privacy loss $\epsilon$).
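
The noise-reduction step described above can be sketched in a few lines: clip the gradient, add Gaussian noise, then project the noisy gradient onto a low-dimensional subspace. In this sketch the orthonormal `basis` is simply assumed to be given (in the abstract it comes from the top gradient eigenspace on public data), and the names `clip_norm` and `sigma` are illustrative assumptions rather than the paper's notation.

```python
import math
import random
from typing import List, Sequence


def clip(grad: Sequence[float], clip_norm: float) -> List[float]:
    """Scale grad down so that its Euclidean norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def project(vec: Sequence[float], basis: Sequence[Sequence[float]]) -> List[float]:
    """Project vec onto the span of an orthonormal basis (assumed given)."""
    out = [0.0] * len(vec)
    for v in basis:
        coeff = sum(a * b for a, b in zip(vec, v))
        for j, vj in enumerate(v):
            out[j] += coeff * vj
    return out


def projected_noisy_gradient(
    grad: Sequence[float],
    basis: Sequence[Sequence[float]],
    clip_norm: float,
    sigma: float,
    rng: random.Random,
) -> List[float]:
    """Clip, add Gaussian noise with std sigma * clip_norm, then project."""
    clipped = clip(grad, clip_norm)
    noisy = [g + rng.gauss(0.0, sigma * clip_norm) for g in clipped]
    return project(noisy, basis)


# A 3-dimensional gradient projected onto a 1-dimensional subspace.
rng = random.Random(0)
update = projected_noisy_gradient([3.0, 4.0, 0.0], [[1.0, 0.0, 0.0]], 5.0, 1.0, rng)
```

Because the projection discards every noise component orthogonal to the subspace, the surviving noise scales with the subspace dimension rather than with $p$.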

Machine Learning, Cryptography and Security, Machine Learning

Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis

72 - Vidyashankar Sivakumar, Zhiwei Steven Wu, Arindam Banerjee 2020

Bandit learning algorithms typically involve the balance of exploration and exploitation. However, in many practical applications, worst-case scenarios needing systematic exploration are seldom encountered. In this work, we consider a smoothed setting for structured linear contextual bandits where the adversarial contexts are perturbed by Gaussian noise and the unknown parameter $\theta^*$ has structure, e.g., sparsity, group sparsity, low rank, etc. We propose simple greedy algorithms for both the single- and multi-parameter (i.e., different parameter for each context) settings and provide a unified regret analysis for $\theta^*$ with any assumed structure. The regret bounds are expressed in terms of geometric quantities such as Gaussian widths associated with the structure of $\theta^*$. We also obtain sharper regret bounds compared to earlier work for the unstructured $\theta^*$ setting as a consequence of our improved analysis. We show there is implicit exploration in the smoothed setting where a simple greedy algorithm works.

Machine Learning, Statistics Theory, Machine Learning

Fair Regression: Quantitative Definitions and Reduction-based Algorithms

127 - Alekh Agarwal, Miroslav Dudik, Zhiwei Steven Wu 2019

In this paper, we study the prediction of a real-valued target, such as a risk score or recidivism rate, while guaranteeing a quantitative notion of fairness with respect to a protected attribute such as gender or race. We call this class of problems fair regression. We propose general schemes for fair regression under two notions of fairness: (1) statistical parity, which asks that the prediction be statistically independent of the protected attribute, and (2) bounded group loss, which asks that the prediction error restricted to any protected group remain below some pre-determined level. While we only study these two notions of fairness, our schemes are applicable to arbitrary Lipschitz-continuous losses, and so they encompass least-squares regression, logistic regression, quantile regression, and many other tasks. Our schemes only require access to standard risk minimization algorithms (such as standard classification or least-squares regression) while providing theoretical guarantees on the optimality and fairness of the obtained solutions. In addition to analyzing theoretical properties of our schemes, we empirically demonstrate their ability to uncover fairness-accuracy frontiers on several standard datasets.
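
The bounded group loss notion defined above is easy to check on a finite sample: compute the average loss within each protected group and compare the largest value to the chosen level. The sketch below does this with squared loss; the function names, the choice of squared loss, and the toy data are assumptions for illustration and say nothing about how the paper's reduction-based algorithms enforce the constraint.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def group_losses(
    y_true: Sequence[float], y_pred: Sequence[float], groups: Sequence[Hashable]
) -> Dict[Hashable, float]:
    """Mean squared error computed separately within each protected group."""
    sums: Dict[Hashable, float] = defaultdict(float)
    counts: Dict[Hashable, int] = defaultdict(int)
    for y, p, g in zip(y_true, y_pred, groups):
        sums[g] += (y - p) ** 2
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}


def satisfies_bounded_group_loss(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    groups: Sequence[Hashable],
    bound: float,
) -> bool:
    """True if every group's empirical loss is at most bound."""
    return all(v <= bound for v in group_losses(y_true, y_pred, groups).values())


# Toy data: group "a" is predicted perfectly, group "b" has one error of size 1.
losses = group_losses([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 2.0, 4.0], ["a", "a", "b", "b"])
```

Statistical parity is a stronger, distributional condition (independence of the prediction from the protected attribute) and cannot be verified by a per-group average alone.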

Machine Learning, Machine Learning

How to Use Heuristics for Differential Privacy

166 - Seth Neel, Aaron Roth, Zhiwei Steven Wu 2018

We develop theory for using heuristics to solve computationally hard problems in differential privacy. Heuristic approaches have enjoyed tremendous success in machine learning, for which performance can be empirically evaluated. However, privacy guarantees cannot be evaluated empirically, and must be proven without making heuristic assumptions. We show that learning problems over broad classes of functions can be solved privately and efficiently, assuming the existence of a non-private oracle for solving the same problem. Our first algorithm yields a privacy guarantee that is contingent on the correctness of the oracle. We then give a reduction which applies to a class of heuristics which we call certifiable, which allows us to convert oracle-dependent privacy guarantees to worst-case privacy guarantees that hold even when the heuristic standing in for the oracle might fail in adversarial ways. Finally, we consider a broad class of functions that includes most classes of simple boolean functions studied in the PAC learning literature, including conjunctions, disjunctions, parities, and discrete halfspaces. We show that there is an efficient algorithm for privately constructing synthetic data for any such class, given a non-private learning oracle. This in particular gives the first oracle-efficient algorithm for privately generating synthetic data for contingency tables. The most intriguing question left open by our work is whether or not every problem that can be solved differentially privately can be privately solved with an oracle-efficient algorithm. While we do not resolve this, we give a barrier result that suggests that any generic oracle-efficient reduction must fall outside of a natural class of algorithms (which includes the algorithms given in this paper).

Machine Learning, Cryptography and Security, Data Structures and Algorithms

Orthogonal Random Forest for Causal Inference

106 - Miruna Oprescu, Vasilis Syrgkanis, Zhiwei Steven Wu 2018

We propose the orthogonal random forest, an algorithm that combines Neyman-orthogonality, which reduces sensitivity to estimation error in nuisance parameters, with generalized random forests (Athey et al., 2017), a flexible non-parametric method for statistical estimation of conditional moment models using random forests. We provide a consistency rate and establish asymptotic normality for our estimator. We show that under mild assumptions on the consistency rate of the nuisance estimator, we can achieve the same error rate as an oracle with a priori knowledge of these nuisance parameters. We show that when the nuisance functions have a locally sparse parametrization, then a local $\ell_1$-penalized regression achieves the required rate. We apply our method to estimate heterogeneous treatment effects from observational data with discrete treatments or continuous treatments, and we show that, unlike prior work, our method provably allows one to control for a high-dimensional set of variables under standard sparsity conditions. We also provide a comprehensive empirical evaluation of our algorithm on both synthetic and real data.

Machine Learning, Econometrics, Statistics Theory

Competing Bandits: Learning under Competition

302 - Yishay Mansour, Aleksandrs Slivkins, Zhiwei Steven Wu 2017

Most modern systems strive to learn from interactions with users, and many engage in exploration: making potentially suboptimal choices for the sake of acquiring new information. We initiate a study of the interplay between exploration and competition, that is, how such systems balance the exploration for learning and the competition for users. Here the users play three distinct roles: they are customers that generate revenue, they are sources of data for learning, and they are self-interested agents which choose among the competing systems. In our model, we consider competition between two multi-armed bandit algorithms faced with the same bandit instance. Users arrive one by one and choose among the two algorithms, so that each algorithm makes progress if and only if it is chosen. We ask whether and to what extent competition incentivizes the adoption of better bandit algorithms. We investigate this issue for several models of user response, as we vary the degree of rationality and competitiveness in the model. Our findings are closely related to the competition vs. innovation relationship, a well-studied theme in economics.

Computer Science and Game Theory, Machine Learning

Logarithmic Query Complexity for Approximate Nash Computation in Large Games

393 - Paul W. Goldberg, Francisco J. Marmolejo-Cossio, Zhiwei Steven Wu 2016

We investigate the problem of equilibrium computation for large $n$-player games. Large games have a Lipschitz-type property that no single player's utility is greatly affected by any other individual player's actions. In this paper, we mostly focus on the case where any change of strategy by a player causes other players' payoffs to change by at most $\frac{1}{n}$. We study algorithms having query access to the game's payoff function, aiming to find $\epsilon$-Nash equilibria. We seek algorithms that obtain $\epsilon$ as small as possible, in time polynomial in $n$. Our main result is a randomised algorithm that achieves $\epsilon$ approaching $\frac{1}{8}$ for 2-strategy games in a completely uncoupled setting, where each player observes her own payoff to a query, and adjusts her behaviour independently of other players' payoffs/actions. $O(\log n)$ rounds/queries are required. We also show how to obtain a slight improvement over $\frac{1}{8}$, by introducing a small amount of communication between the players. Finally, we give extensions of our results to large games with more than two strategies per player, and alternative largeness parameters.
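
To make the $\epsilon$-Nash condition concrete, the sketch below checks it for pure strategy profiles in a hypothetical 2-strategy game: a profile is an $\epsilon$-Nash equilibrium if no player can gain more than $\epsilon$ by deviating unilaterally. The coordination-style `payoff` function is an assumption chosen only because one player's switch moves any other player's payoff by at most $\frac{1}{n-1}$; it is not taken from the paper, and the paper's query algorithms are not reproduced here.

```python
from typing import Callable, Sequence


def payoff(player: int, profile: Sequence[int]) -> float:
    """Hypothetical payoff: fraction of the other players choosing the same action."""
    others = [a for j, a in enumerate(profile) if j != player]
    return sum(1 for a in others if a == profile[player]) / len(others)


def max_regret(
    profile: Sequence[int],
    payoff_fn: Callable[[int, Sequence[int]], float] = payoff,
    actions: Sequence[int] = (0, 1),
) -> float:
    """Largest gain any single player can obtain by a unilateral deviation."""
    worst = 0.0
    for i in range(len(profile)):
        base = payoff_fn(i, profile)
        for a in actions:
            deviated = list(profile)
            deviated[i] = a
            worst = max(worst, payoff_fn(i, deviated) - base)
    return worst


def is_epsilon_nash(profile: Sequence[int], epsilon: float) -> bool:
    """True if no player gains more than epsilon by deviating."""
    return max_regret(profile) <= epsilon


# Everyone coordinating is an exact equilibrium; an even split is not.
regret_all_same = max_regret([0, 0, 0, 0])
regret_split = max_regret([0, 0, 1, 1])
```

Each call to `payoff_fn` corresponds to one payoff query, so this brute-force check uses a number of queries linear in $n$ per profile, which is the kind of cost the paper's $O(\log n)$-round algorithm is designed to avoid.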

Computer Science and Game Theory, Data Structures and Algorithms
