In modern settings of data analysis, we may be running our algorithms on datasets that are sensitive in nature. However, classical machine learning and statistical algorithms were not designed with these risks in mind, and it has been demonstrated that they may reveal personal information. These concerns disincentivize individuals from providing their data, or, even worse, encourage them to intentionally provide fake data. To assuage these concerns, we import the constraint of differential privacy, considered by many to be the gold standard of data privacy, into statistical inference. This thesis aims to quantify the cost of ensuring differential privacy, i.e., to understand how much additional data is required to perform data analysis under the constraint of differential privacy. Despite the maturity of the literature on differential privacy, our understanding of some of the most fundamental settings is still inadequate. In particular, we make progress on the following problems:
• What is the sample complexity of DP hypothesis testing?
• Can we privately estimate distribution properties at a negligible cost?
• What is the fundamental limit in private distribution estimation?
• How can we design algorithms to privately estimate random graphs?
• What is the trade-off between sample complexity and interactivity in private hypothesis selection?
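For concreteness, the privacy constraint referred to throughout is the standard notion of $(\varepsilon, \delta)$-differential privacy (the symbols $M$, $X$, $X'$, $S$ below are generic notation, not taken from the abstract): a randomized algorithm $M$ is $(\varepsilon, \delta)$-DP if, for every pair of datasets $X, X'$ differing in one individual's record and every event $S$,

$$\Pr[M(X) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(X') \in S] + \delta.$$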
We study privacy in a distributed learning framework, where clients collaboratively and iteratively build a learning model through interactions with a server from which privacy must be preserved. Motivated by stochastic optimization and the federated learning (FL) …
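As a rough illustration of this setting, and not the paper's algorithm, the sketch below shows one federated round in which each client clips and noises its update locally before the untrusted server averages the reports; the function and parameter names (privatize_update, clip_norm, noise_std) are illustrative assumptions.

```python
# Minimal sketch: clients protect their updates from an untrusted server by
# clipping to a bounded L2 norm and adding Gaussian noise before sending.
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_std=0.5, rng=None):
    """Clip the client's update to L2 norm clip_norm, then add Gaussian noise."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(0.0, noise_std, size=update.shape)

def server_aggregate(private_updates):
    """The server only ever sees noisy, clipped updates."""
    return np.mean(private_updates, axis=0)

# Example round with 10 clients and a 5-dimensional model.
rng = np.random.default_rng(0)
true_updates = [rng.normal(size=5) for _ in range(10)]
noisy = [privatize_update(u, rng=rng) for u in true_updates]
aggregate = server_aggregate(noisy)
```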
Many privacy mechanisms reveal high-level information about a data distribution through noisy measurements. It is common to use this information to estimate the answers to new queries. In this work, we provide an approach to solve this estimation problem …
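A minimal sketch of the general idea, assuming a simple least-squares estimator rather than this paper's method: fit a histogram consistent with the noisy answers to the measured queries, then answer a new query by post-processing the fit. The workload W, noise scale, and query vectors below are made up for illustration.

```python
# Estimate a histogram from noisy linear query answers, then answer a new query.
# Post-processing the noisy measurements incurs no additional privacy cost.
import numpy as np

true_hist = np.array([10.0, 20.0, 30.0, 40.0])     # unknown to the analyst
W = np.array([[1.0, 1.0, 0.0, 0.0],                # measured queries (one per row)
              [0.0, 0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0, 0.0]])
rng = np.random.default_rng(0)
noisy_answers = W @ true_hist + rng.laplace(scale=1.0, size=W.shape[0])

# Minimum-norm least-squares fit, then a simple non-negativity correction.
est_hist, *_ = np.linalg.lstsq(W, noisy_answers, rcond=None)
est_hist = np.clip(est_hist, 0, None)

# A new, unmeasured query is answered from the estimated histogram.
new_query = np.array([0.0, 1.0, 1.0, 0.0])
print(new_query @ est_hist)
```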
We give a fast algorithm to optimally compose privacy guarantees of differentially private (DP) algorithms to arbitrary accuracy. Our method is based on the notion of privacy loss random variables to quantify the privacy loss of DP algorithms. …
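A hedged numerical sketch of the underlying idea, not the paper's optimal-accuracy algorithm: discretize the privacy loss distribution of a single Gaussian mechanism, convolve it with itself k times (the composed privacy loss is the sum of independent per-mechanism losses), and read off delta(eps). Grid size and parameters are illustrative; a careful implementation would use FFT-based convolution and pessimistic rounding of the mass.

```python
import numpy as np

sigma, k, eps = 2.0, 10, 1.0            # noise scale, number of compositions, target eps

# PLRV of one Gaussian mechanism with sensitivity 1: L ~ N(mu, 2*mu), mu = 1/(2 sigma^2).
mu = 1.0 / (2 * sigma ** 2)
grid = np.linspace(-5.0, 5.0, 2001)
step = grid[1] - grid[0]
pmf = np.exp(-(grid - mu) ** 2 / (4 * mu))
pmf /= pmf.sum()                         # discretized privacy loss distribution

# k-fold composition = k-fold convolution of the discretized distribution.
total = np.array([1.0])
for _ in range(k):
    total = np.convolve(total, pmf)

# The k-fold convolution is supported on a grid starting at k * grid[0], same step.
support = k * grid[0] + step * np.arange(total.size)

# delta(eps) = E[(1 - exp(eps - L)) * 1{L > eps}] under the composed loss.
mask = support > eps
delta = np.sum(total[mask] * (1.0 - np.exp(eps - support[mask])))
print(f"delta({eps}) after {k} compositions ~ {delta:.3e}")
```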
We study how to communicate findings of Bayesian inference to third parties, while preserving the strong guarantee of differential privacy. Our main contributions are four different algorithms for private Bayesian inference on probabilistic graphical models …
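One standard route to private Bayesian inference, shown only as an illustration and not claimed to be one of the paper's four algorithms: perturb the model's sufficient statistics and condition on the noisy values. The Beta-Bernoulli example and function names below are assumptions.

```python
# Beta-Bernoulli model: the count of successes is a sufficient statistic with
# sensitivity 1, so Laplace noise of scale 1/epsilon gives epsilon-DP.
import numpy as np

def private_beta_posterior(data, epsilon, alpha=1.0, beta=1.0, rng=None):
    rng = rng or np.random.default_rng()
    n = len(data)
    noisy_successes = np.sum(data) + rng.laplace(scale=1.0 / epsilon)
    # Clamp to the feasible range so the released posterior is well defined.
    s = float(np.clip(noisy_successes, 0, n))
    return alpha + s, beta + (n - s)     # parameters of the released Beta posterior

rng = np.random.default_rng(1)
data = rng.binomial(1, 0.7, size=200)    # 200 Bernoulli(0.7) observations
a, b = private_beta_posterior(data, epsilon=0.5, rng=rng)
print(f"private posterior mean ~ {a / (a + b):.3f}")
```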
LDP (Local Differential Privacy) has been widely studied to estimate statistics of personal data (e.g., the distribution underlying the data) while protecting users' privacy. Although LDP does not require a trusted third party, it regards all personal data …
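For intuition, a minimal sketch of LDP distribution estimation with k-ary randomized response, a standard local mechanism and not necessarily the one studied here; all names and parameters are illustrative. Each user reports their true category with probability p and a uniformly random other category otherwise, and the aggregator debiases the observed frequencies by inverting the randomization.

```python
import numpy as np

def krr_report(x, k, epsilon, rng):
    """k-ary randomized response: keep the true value with probability p."""
    p = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    if rng.random() < p:
        return x
    return rng.choice([v for v in range(k) if v != x])

def estimate_distribution(reports, k, epsilon):
    """Unbiased frequency estimate obtained by inverting the response channel."""
    n = len(reports)
    p = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    q = (1 - p) / (k - 1)
    counts = np.bincount(reports, minlength=k)
    return (counts / n - q) / (p - q)

rng = np.random.default_rng(0)
k, epsilon, true_dist = 4, 1.0, np.array([0.1, 0.2, 0.3, 0.4])
data = rng.choice(k, size=50_000, p=true_dist)
reports = np.array([krr_report(x, k, epsilon, rng) for x in data])
print(estimate_distribution(reports, k, epsilon))
```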