Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Optimizing error of high-dimensional statistical queries under differential privacy

174 0 0.0 ( 0 )

Download Cite

Added by Ryan McKenna

Publication date 2018

fields Informatics Engineering

and research's language is English

Authors Ryan McKenna - Gerome Miklau - Michael Hay

Databases Cryptography and Security

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Differentially private algorithms for answering sets of predicate counting queries on a sensitive database have many applications. Organizations that collect individual-level data, such as statistical agencies and medical institutions, use them to safely release summary tabulations. However, existing techniques are accurate only on a narrow class of query workloads, or are extremely slow, especially when analyzing more than one or two dimensions of the data. In this work we propose HDMM, a new differentially private algorithm for answering a workload of predicate counting queries, that is especially effective for higher-dimensional datasets. HDMM represents query workloads using an implicit matrix representation and exploits this compact representation to efficiently search (a subset of) the space of differentially private algorithms for one that answers the input query workload with high accuracy. We empirically show that HDMM can efficiently answer queries with lower error than state-of-the-art techniques on a variety of low and high dimensional datasets.

rate research

HDMM: Optimizing error of high-dimensional statistical queries under differential privacy

62 - Ryan McKenna , Gerome Miklau , Michael Hay 2021

In this work we describe the High-Dimensional Matrix Mechanism (HDMM), a differentially private algorithm for answering a workload of predicate counting queries. HDMM represents query workloads using a compact implicit matrix representation and exploits this representation to efficiently optimize over (a subset of) the space of differentially private algorithms for one that is unbiased and answers the input query workload with low expected error. HDMM can be deployed for both $epsilon$-differential privacy (with Laplace noise) and $(epsilon, delta)$-differential privacy (with Gaussian noise), although the core techniques are slightly different for each. We demonstrate empirically that HDMM can efficiently answer queries with lower expected error than state-of-the-art techniques, and in some cases, it nearly matches existing lower bounds for the particular class of mechanisms we consider.

Databases Cryptography and Security

Answering Summation Queries for Numerical Attributes under Differential Privacy

81 - Yikai Wu , David Pujol , Ios Kotsogiannis 2019

In this work we explore the problem of answering a set of sum queries under Differential Privacy. This is a little understood, non-trivial problem especially in the case of numerical domains. We show that traditional techniques from the literature are not always the best choice and a more rigorous approach is necessary to develop low error algorithms.

Databases Cryptography and Security

A workload-adaptive mechanism for linear queries under local differential privacy

91 - Ryan McKenna , Raj Kumar Maity , Arya Mazumdar 2020

We propose a new mechanism to accurately answer a user-provided set of linear counting queries under local differential privacy (LDP). Given a set of linear counting queries (the workload) our mechanism automatically adapts to provide accuracy on the workload queries. We define a parametric class of mechanisms that produce unbiased estimates of the workload, and formulate a constrained optimization problem to select a mechanism from this class that minimizes expected total squared error. We solve this optimization problem numerically using projected gradient descent and provide an efficient implementation that scales to large workloads. We demonstrate the effectiveness of our optimization-based approach in a wide variety of settings, showing that it outperforms many competitors, even outperforming existing mechanisms on the workloads for which they were intended.

Databases Cryptography and Security

Optimizing the Numbers of Queries and Replies in Federated Learning with Differential Privacy

131 - Yipeng Zhou , Xuezheng Liu , Yao Fu 2021

Federated learning (FL) empowers distributed clients to collaboratively train a shared machine learning model through exchanging parameter information. Despite the fact that FL can protect clients raw data, malicious users can still crack original data with disclosed parameters. To amend this flaw, differential privacy (DP) is incorporated into FL clients to disturb original parameters, which however can significantly impair the accuracy of the trained model. In this work, we study a crucial question which has been vastly overlooked by existing works: what are the optimal numbers of queries and replies in FL with DP so that the final model accuracy is maximized. In FL, the parameter server (PS) needs to query participating clients for multiple global iterations to complete training. Each client responds a query from the PS by conducting a local iteration. Our work investigates how many times the PS should query clients and how many times each client should reply the PS. We investigate two most extensively used DP mechanisms (i.e., the Laplace mechanism and Gaussian mechanisms). Through conducting convergence rate analysis, we can determine the optimal numbers of queries and replies in FL with DP so that the final model accuracy can be maximized. Finally, extensive experiments are conducted with publicly available datasets: MNIST and FEMNIST, to verify our analysis and the results demonstrate that properly setting the numbers of queries and replies can significantly improve the final model accuracy in FL with DP.

Machine Learning Cryptography and Security Distributed Parallel and Cluster Computing

Frequency Estimation under Local Differential Privacy [Experiments, Analysis and Benchmarks]

82 - Graham Cormode , Samuel Maddock , Carsten Maple 2021

Private collection of statistics from a large distributed population is an important problem, and has led to large scale deployments from several leading technology companies. The dominant approach requires each user to randomly perturb their input, leading to guarantees in the local differential privacy model. In this paper, we place the various approaches that have been suggested into a common framework, and perform an extensive series of experiments to understand the tradeoffs between different implementation choices. Our conclusion is that for the core problems of frequency estimation and heavy hitter identification, careful choice of algorithms can lead to very effective solutions that scale to millions of users

Databases Cryptography and Security

comments

Fetching comments

Aِl-Baath University

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Optimizing error of high-dimensional statistical queries under differential privacy

Ask ChatGPT about the research

No Arabic abstract

Read More