Differential Privacy on Finite Computers

Published by: Victor Balcer
Publication date: 2017
Research field: Informatics Engineering
Language: English

We consider the problem of designing and analyzing differentially private algorithms that can be implemented on {\em discrete} models of computation in {\em strict} polynomial time, motivated by known attacks on floating-point implementations of real-arithmetic differentially private algorithms (Mironov, CCS 2012) and the potential for timing attacks on expected polynomial-time algorithms. As a case study, we examine the basic problem of approximating the histogram of a categorical dataset over a possibly large data universe $\mathcal{X}$. The classic Laplace Mechanism (Dwork, McSherry, Nissim, Smith, TCC 2006 and J. Privacy & Confidentiality 2017) does not satisfy our requirements, as it is based on real arithmetic, and natural discrete analogues, such as the Geometric Mechanism (Ghosh, Roughgarden, Sundararajan, STOC 2009 and SICOMP 2012), take time at least linear in $|\mathcal{X}|$, which can be exponential in the bit length of the input. In this paper, we provide strict polynomial-time discrete algorithms for approximate histograms whose simultaneous accuracy (the maximum error over all bins) matches that of the Laplace Mechanism up to constant factors, while retaining the same (pure) differential privacy guarantee. One of our algorithms produces a sparse histogram as output. Its per-bin accuracy (the error on individual bins) is worse than that of the Laplace Mechanism by a factor of $\log|\mathcal{X}|$, but we prove a lower bound showing that this is necessary for any algorithm that produces a sparse histogram. A second algorithm avoids this lower bound, and matches the per-bin accuracy of the Laplace Mechanism, by producing a compact and efficiently computable representation of a dense histogram; it is based on an $(n+1)$-wise independent implementation of an appropriately clamped version of the Discrete Geometric Mechanism.
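
To make the baseline concrete, here is a minimal sketch of the Geometric Mechanism for histograms that the abstract contrasts against: it adds two-sided geometric noise to every bin and clamps the result to the feasible range [0, n]. Two hedges: the sketch iterates over the entire universe (exactly the linear-in-$|\mathcal{X}|$ cost the paper's algorithms avoid) and samples noise with floating-point arithmetic (the very weakness behind Mironov's attack), so it illustrates the mechanism being improved upon, not the paper's finite-computer construction. Function names and the clamping convention are our assumptions.

```python
import math
import random

def one_sided_geometric(alpha: float) -> int:
    # Failures before the first success, success probability 1 - alpha,
    # via inverse-CDF sampling: floor(log(U) / log(alpha)) for U in (0, 1].
    u = 1.0 - random.random()  # in (0, 1]; avoids log(0)
    return int(math.log(u) / math.log(alpha))

def two_sided_geometric(epsilon: float) -> int:
    # The difference of two i.i.d. one-sided geometrics is two-sided
    # geometric: P(Z = z) = ((1 - a) / (1 + a)) * a^|z| with a = exp(-epsilon).
    alpha = math.exp(-epsilon)
    return one_sided_geometric(alpha) - one_sided_geometric(alpha)

def geometric_mechanism_histogram(data, universe, epsilon, n):
    # Per-bin count plus two-sided geometric noise, clamped to [0, n].
    # Iterating over all of `universe` is the Theta(|X|) cost that the
    # paper's strict polynomial-time algorithms avoid.
    counts = {x: 0 for x in universe}
    for x in data:
        counts[x] += 1
    return {x: min(max(c + two_sided_geometric(epsilon), 0), n)
            for x, c in counts.items()}
```

For example, geometric_mechanism_histogram(["a", "a", "b"], ["a", "b", "c"], epsilon=1.0, n=3) returns a noisy, clamped count for each of the three bins.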

Read also

We give a fast algorithm to optimally compose privacy guarantees of differentially private (DP) algorithms to arbitrary accuracy. Our method is based on the notion of privacy loss random variables to quantify the privacy loss of DP algorithms. The running time and memory needed for our algorithm to approximate the privacy curve of a DP algorithm composed with itself $k$ times is $\tilde{O}(\sqrt{k})$. This improves over the best prior method of Koskela et al. (2020), which requires $\tilde{\Omega}(k^{1.5})$ running time. We demonstrate the utility of our algorithm by accurately computing the privacy loss of the DP-SGD algorithm of Abadi et al. (2016) and showing that our algorithm speeds up the privacy computations by a few orders of magnitude compared to prior work, while maintaining similar accuracy.
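
As a rough illustration of the privacy-loss-random-variable viewpoint (not the paper's $\tilde{O}(\sqrt{k})$ algorithm, which is more refined), the following sketch discretizes the privacy loss distribution of the Gaussian mechanism on a grid, self-composes it $k$ times with a single FFT power, and reads off $\delta(\epsilon)$. The grid spacing, truncation range, and the choice of the Gaussian mechanism are assumptions of this sketch.

```python
import numpy as np

def gaussian_pld(sigma: float, grid: np.ndarray) -> np.ndarray:
    # Privacy loss of N(0, sigma^2) vs N(1, sigma^2) is itself Gaussian,
    # N(mu, 2*mu) with mu = 1 / (2 sigma^2); discretize it on the grid.
    mu = 1.0 / (2.0 * sigma ** 2)
    pmf = np.exp(-((grid - mu) ** 2) / (4.0 * mu))
    return pmf / pmf.sum()

def self_compose(pmf: np.ndarray, k: int) -> np.ndarray:
    # The loss of k-fold composition is the sum of k i.i.d. losses, i.e. a
    # k-fold convolution -- a k-th power in the frequency domain. Zero-pad
    # so the support of the sum fits without circular wrap-around.
    out_len = (len(pmf) - 1) * k + 1
    padded = np.zeros(out_len)
    padded[: len(pmf)] = pmf
    return np.fft.irfft(np.fft.rfft(padded) ** k, out_len)

def delta(pmf_k: np.ndarray, grid_k: np.ndarray, eps: float) -> float:
    # Hockey-stick computation, delta(eps) = E[max(0, 1 - exp(eps - L))];
    # clipping eps - L at 0 is exact (losses below eps contribute zero)
    # and tiny FFT round-off in pmf_k is ignored in this sketch.
    return float(np.sum(pmf_k * (1.0 - np.exp(np.minimum(eps - grid_k, 0.0)))))

h = 0.01                        # grid spacing (assumption)
grid = np.arange(-4.0, 4.0, h)  # truncation range (assumption)
k = 100
pmf_k = self_compose(gaussian_pld(sigma=2.0, grid=grid), k)
grid_k = k * grid[0] + h * np.arange(len(pmf_k))  # composed support
print(delta(pmf_k, grid_k, eps=1.0))
```
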
A streaming algorithm is said to be adversarially robust if its accuracy guarantees are maintained even when the data stream is chosen maliciously, by an adaptive adversary. We establish a connection between adversarial robustness of streaming algorithms and the notion of differential privacy. This connection allows us to design new adversarially robust streaming algorithms that outperform the current state-of-the-art constructions for many interesting regimes of parameters.
Huanyu Zhang, 2021
In modern settings of data analysis, we may be running our algorithms on datasets that are sensitive in nature. However, classical machine learning and statistical algorithms were not designed with these risks in mind, and it has been demonstrated that they may reveal personal information. These concerns disincentivize individuals from providing their data, or, even worse, encourage them to intentionally provide fake data. To assuage these concerns, we bring the constraint of differential privacy, considered by many to be the gold standard of data privacy, to statistical inference. This thesis aims to quantify the cost of ensuring differential privacy, i.e., to understand how much additional data is required to perform data analysis under this constraint. Despite the maturity of the literature on differential privacy, some of the most fundamental settings remain inadequately understood. In particular, we make progress on the following problems:
- What is the sample complexity of DP hypothesis testing?
- Can we privately estimate distribution properties with a negligible cost?
- What is the fundamental limit in private distribution estimation?
- How can we design algorithms to privately estimate random graphs?
- What is the trade-off between sample complexity and interactivity in private hypothesis selection?
Streaming algorithms are algorithms for processing large data streams using only a limited amount of memory. Classical streaming algorithms operate under the assumption that the input stream is fixed in advance. Recently, there has been growing interest in streaming algorithms that provide provable guarantees even when the input stream is chosen by an adaptive adversary; such streaming algorithms are said to be {\em adversarially robust}. We propose a novel framework for adversarial streaming that hybridizes two recently suggested frameworks, by Hassidim et al. (2020) and by Woodruff and Zhou (2021). These frameworks rely on very different ideas, each with its own strengths and weaknesses. We combine the two (in a non-trivial way) into a single hybrid framework that gains from both approaches to obtain superior performance for turnstile streams.
In recent years, concerns about location privacy have been increasing with the spread of location-based services (LBSs). Many methods to protect location privacy have been proposed over the past decades. In particular, perturbation methods based on Geo-Indistinguishability (Geo-I), which randomly perturb a true location to a pseudolocation, are attracting attention due to the strong privacy guarantee inherited from differential privacy. However, Geo-I is defined on the Euclidean plane, even though many LBSs operate on road networks (e.g., ride-sharing services). This causes unnecessary noise and thus a poor tradeoff between utility and privacy for LBSs on road networks. To address this issue, we propose a new privacy notion, Geo-Graph-Indistinguishability (GG-I), for locations on a road network, which achieves a better tradeoff. We propose the Graph-Exponential Mechanism (GEM), which satisfies GG-I. Moreover, we formalize the optimization problem of finding the GEM that is optimal with respect to the tradeoff. However, the computational complexity of a naive method for finding the optimal solution is prohibitive, so we propose a greedy algorithm that finds an approximate solution in an acceptable amount of time. Finally, our experiments show that our proposed mechanism outperforms a Geo-I mechanism with respect to the tradeoff.
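
For flavor, here is a generic exponential-mechanism sketch over the nodes of a road-network graph, scoring candidate pseudolocations by negative shortest-path distance from the true location. The paper's GEM is of this kind, but the exact probability scaling and its calibration to GG-I follow the paper, not this sketch; the networkx-based toy graph and parameter choices are our assumptions.

```python
import math
import random
import networkx as nx

def graph_exponential_mechanism(G: nx.Graph, true_node, epsilon: float):
    # Sample a pseudolocation v with probability proportional to
    # exp(-(epsilon / 2) * d(u, v)), where d is the weighted shortest-path
    # distance on the road network from the true location u.
    dist = nx.single_source_dijkstra_path_length(G, true_node, weight="weight")
    nodes = list(dist)
    weights = [math.exp(-0.5 * epsilon * dist[v]) for v in nodes]
    return random.choices(nodes, weights=weights, k=1)[0]

# A toy road network: edge weights stand in for road-segment lengths.
G = nx.Graph()
G.add_weighted_edges_from([("a", "b", 1.0), ("b", "c", 2.0), ("a", "c", 4.0)])
print(graph_exponential_mechanism(G, "a", epsilon=1.0))
```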