The minimum mean-square error (MMSE) achievable by optimal estimation of a random variable $Y\in\mathbb{R}$ given another random variable $X\in\mathbb{R}^{d}$ is of much interest in a variety of statistical contexts. In this paper we propose two estimators for the MMSE, one based on a two-layer neural network and the other on a special three-layer neural network. We derive lower bounds for the MMSE based on the proposed estimators and the Barron constant of an appropriate function of the conditional expectation of $Y$ given $X$. Furthermore, we derive a general upper bound for the Barron constant that, when $X\in\mathbb{R}$ is post-processed by the additive Gaussian mechanism, produces order-optimal estimates in the large-noise regime.
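For reference, the quantity being estimated is the standard minimum mean-square error; the following is a definitional restatement in standard notation, not text taken from the paper:

$$
\mathrm{mmse}(Y \mid X)
\;=\; \inf_{f\ \text{measurable}} \mathbb{E}\!\left[(Y - f(X))^{2}\right]
\;=\; \mathbb{E}\!\left[(Y - \mathbb{E}[Y \mid X])^{2}\right].
$$

The optimizer is the conditional expectation $\mathbb{E}[Y\mid X]$, which is why the Barron constant of (a function of) this conditional expectation governs how well shallow neural networks can approximate it, and hence how tight the resulting MMSE bounds are.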
While recent works have indicated that federated learning (FL) is vulnerable to poisoning attacks by compromised clients, we show that these works make a number of unrealistic assumptions and arrive at somewhat misleading conclusions. For instance, they often use impractically high percentages of compromised clients or assume unrealistic capabilities for the adversary. We perform the first critical analysis of poisoning attacks under practical production FL environments by carefully characterizing the set of realistic threat models and adversarial capabilities. Our findings are rather surprising: contrary to the established belief, we show that FL, even without any defenses, is highly robust in practice. In fact, we go even further and propose novel, state-of-the-art poisoning attacks under two realistic threat models, and show via an extensive set of experiments across three benchmark datasets how (in)effective poisoning attacks are, especially when simple defense mechanisms are used. We correct previous misconceptions and give concrete guidelines that we hope will encourage our community to conduct more accurate research in this space and build stronger (and more realistic) attacks and defenses.
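As an illustration of the kind of "simple defense mechanism" referenced above, the sketch below applies norm bounding to client updates before averaging. This is a generic, well-known mitigation written for illustration; the function name and the fixed threshold are our own assumptions, not the exact defenses evaluated in the paper.

```python
import numpy as np

def norm_bounded_aggregate(client_updates, norm_bound=1.0):
    """Average client model updates after clipping each to a fixed L2 norm.

    Clipping limits how much any single (possibly poisoned) update can move
    the global model, which is one of the simple server-side defenses that
    already blunts many poisoning attacks in practice.
    """
    clipped = []
    for update in client_updates:
        norm = np.linalg.norm(update)
        scale = min(1.0, norm_bound / (norm + 1e-12))  # shrink only oversized updates
        clipped.append(update * scale)
    return np.mean(clipped, axis=0)

# Example: one malicious client submits a hugely scaled update.
honest = [np.random.normal(0, 0.1, size=10) for _ in range(9)]
malicious = [np.random.normal(0, 0.1, size=10) * 100.0]
aggregate = norm_bounded_aggregate(honest + malicious, norm_bound=1.0)
```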
We consider training models with differential privacy (DP) using mini-batch gradients. The existing state-of-the-art, Differentially Private Stochastic Gradient Descent (DP-SGD), requires privacy amplification by sampling or shuffling to obtain the best privacy/accuracy/computation trade-offs. Unfortunately, the precise requirements on exact sampling and shuffling can be hard to obtain in important practical scenarios, particularly federated learning (FL). We design and analyze a DP variant of Follow-The-Regularized-Leader (DP-FTRL) that compares favorably (both theoretically and empirically) to amplified DP-SGD, while allowing for much more flexible data access patterns. DP-FTRL does not use any form of privacy amplification. The code is available at https://github.com/google-research/federated/tree/master/dp_ftrl and https://github.com/google-research/DP-FTRL .
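To make the "no amplification" point concrete, DP-FTRL injects noise through tree aggregation of gradient prefix sums rather than relying on sampled or shuffled mini-batches. The sketch below is a minimal, self-contained illustration of such a tree-aggregated noisy prefix sum, assuming gradients are already clipped to bounded norm; it is not the released implementation linked above.

```python
import numpy as np

def tree_noisy_prefix_sums(grads, noise_std, seed=0):
    """Noisy prefix sums of (pre-clipped) gradients via binary-tree aggregation.

    Each tree node stores the true sum of a dyadic block of steps plus one
    fresh draw of Gaussian noise, so any released prefix sum combines only
    O(log t) noise terms -- no sampling or shuffling of the data is needed.
    """
    rng = np.random.default_rng(seed)
    dim = grads[0].shape[0]
    stack = []            # (block_size, true_block_sum, noisy_block_sum)
    released = []
    for g in grads:
        size, block = 1, g.astype(float)
        # merge equal-sized sibling blocks into their parent block
        while stack and stack[-1][0] == size:
            _, prev_block, _ = stack.pop()
            block = prev_block + block
            size *= 2
        noisy = block + rng.normal(0.0, noise_std, size=dim)
        stack.append((size, block, noisy))
        released.append(sum(n for _, _, n in stack))  # dyadic cover of steps 1..t
    return released

# FTRL-style updates would then be driven by these noisy cumulative gradients.
grads = [np.random.normal(0, 1, size=5) for _ in range(8)]
noisy_sums = tree_noisy_prefix_sums(grads, noise_std=0.5)
```

In the actual mechanism, `noise_std` is calibrated to the clipping norm and the tree depth to meet the target privacy guarantee; the sketch leaves that calibration out.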
We consider the problem of estimating sparse discrete distributions under local differential privacy (LDP) and communication constraints. We characterize the sample complexity for sparse estimation under LDP constraints up to a constant factor and the sample complexity under communication constraints up to a logarithmic factor. Our upper bounds under LDP are based on the Hadamard Response, a private coin scheme that requires only one bit of communication per user. Under communication constraints, we propose public coin schemes based on random hashing functions. Our tight lower bounds are based on the recently proposed method of chi-squared contractions.
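For intuition about one-bit schemes in this setting, the sketch below implements a simple Hadamard-based estimator in which each user is assigned a publicly known Hadamard row and reports a single randomized-response bit. It is an illustrative public-coin variant written by us for this note, not the exact private-coin Hadamard Response or the hashing schemes from the paper.

```python
import numpy as np
from scipy.linalg import hadamard

def one_bit_hadamard_estimate(samples, k, eps, seed=0):
    """Estimate a distribution over [k] from one eps-LDP bit per user.

    User i is assigned a public uniformly random Hadamard row j_i, computes
    the sign H[j_i, x_i] in {-1, +1}, and flips it with probability
    1 / (e^eps + 1) (binary randomized response).  The server debiases the
    reported signs to estimate (H p)_j for each row j, then inverts the
    Hadamard transform (H H = K * I) to recover p.
    """
    rng = np.random.default_rng(seed)
    K = 1 << (k - 1).bit_length()            # pad alphabet to a power of two
    H = hadamard(K)
    n = len(samples)
    rows = rng.integers(0, K, size=n)        # public coins
    true_signs = H[rows, np.asarray(samples)]
    keep = rng.random(n) < np.exp(eps) / (np.exp(eps) + 1.0)
    reports = np.where(keep, true_signs, -true_signs)   # one bit per user

    bias = (np.exp(eps) - 1.0) / (np.exp(eps) + 1.0)
    h_hat = np.zeros(K)
    for j in range(K):
        mask = rows == j
        if mask.any():
            h_hat[j] = reports[mask].mean() / bias       # estimates (H p)_j
    p_hat = (H @ h_hat) / K                  # invert the Hadamard transform
    return np.clip(p_hat[:k], 0.0, None)

# Example: sparse distribution supported on 2 of k = 16 symbols.
samples = np.random.default_rng(1).choice([0, 5], size=20000, p=[0.7, 0.3])
print(one_bit_hadamard_estimate(samples, k=16, eps=1.0).round(2))
```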
We revisit the problem of empirical risk minimization (ERM) with differential privacy. We show that noisy AdaGrad, given appropriate knowledge and conditions on the subspace from which gradients can be drawn, achieves a regret comparable to traditional AdaGrad plus a well-controlled term due to noise. We show a convergence rate of $O(\mathrm{Tr}(G_T)/T)$, where $G_T$ captures the geometry of the gradient subspace. Since $\mathrm{Tr}(G_T)=O(\sqrt{T})$, we can obtain faster rates for convex and Lipschitz functions, compared to the $O(1/\sqrt{T})$ rate achieved by known methods.
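The claimed improvement follows from a one-line comparison (our restatement in the abstract's notation):

$$
O\!\left(\frac{\mathrm{Tr}(G_T)}{T}\right)
\;\le\;
O\!\left(\frac{\sqrt{T}}{T}\right)
\;=\;
O\!\left(\frac{1}{\sqrt{T}}\right),
$$

so the bound is never worse than the $O(1/\sqrt{T})$ baseline, and it is strictly faster whenever the gradients concentrate in a low-dimensional subspace so that $\mathrm{Tr}(G_T) = o(\sqrt{T})$.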
Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.
State-of-the-art machine learning algorithms can be fooled by carefully crafted adversarial examples. As such, adversarial examples present a concrete problem in AI safety. In this work we turn the tables and ask the following question: can we harness the power of adversarial examples to prevent malicious adversaries from learning identifying information from data while allowing non-malicious entities to benefit from the utility of the same data? For instance, can we use adversarial examples to anonymize a biometric dataset of faces while retaining the usefulness of this data for other purposes, such as emotion recognition? To address this question, we propose a simple yet effective method, called Siamese Generative Adversarial Privatizer (SGAP), that exploits the properties of a Siamese neural network to find discriminative features that convey identifying information. When coupled with a generative model, our approach is able to correctly locate and disguise identifying information, while minimally reducing the utility of the privatized dataset. Extensive evaluation on a biometric dataset of fingerprints and cartoon faces confirms the usefulness of our simple yet effective method.
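To make the Siamese component concrete, the sketch below shows the standard contrastive loss such a network typically minimizes, pulling embeddings of the same identity together and pushing different identities apart; the margin value and function name are illustrative assumptions, not the exact SGAP objective.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_identity, margin=1.0):
    """Standard Siamese contrastive loss on a batch of embedding pairs.

    Pairs from the same identity are pulled together (squared-distance term);
    pairs from different identities are pushed at least `margin` apart.  The
    learned embedding highlights exactly the identity-revealing features that
    a privatizer would then need to locate and disguise.
    """
    d = np.linalg.norm(emb_a - emb_b, axis=1)                  # pairwise distances
    pos = same_identity * d ** 2
    neg = (1.0 - same_identity) * np.maximum(0.0, margin - d) ** 2
    return float(np.mean(pos + neg))

# Toy batch: 4 embedding pairs, the first two sharing an identity.
rng = np.random.default_rng(0)
emb_a, emb_b = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
labels = np.array([1.0, 1.0, 0.0, 0.0])
print(contrastive_loss(emb_a, emb_b, labels))
```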