ﻻ يوجد ملخص باللغة العربية
Differentially private algorithms for answering sets of predicate counting queries on a sensitive database have many applications. Organizations that collect individual-level data, such as statistical agencies and medical institutions, use them to safely release summary tabulations. However, existing techniques are accurate only on a narrow class of query workloads, or are extremely slow, especially when analyzing more than one or two dimensions of the data. In this work we propose HDMM, a new differentially private algorithm for answering a workload of predicate counting queries, that is especially effective for higher-dimensional datasets. HDMM represents query workloads using an implicit matrix representation and exploits this compact representation to efficiently search (a subset of) the space of differentially private algorithms for one that answers the input query workload with high accuracy. We empirically show that HDMM can efficiently answer queries with lower error than state-of-the-art techniques on a variety of low and high dimensional datasets.
In this work we describe the High-Dimensional Matrix Mechanism (HDMM), a differentially private algorithm for answering a workload of predicate counting queries. HDMM represents query workloads using a compact implicit matrix representation and explo
In this work we explore the problem of answering a set of sum queries under Differential Privacy. This is a little understood, non-trivial problem especially in the case of numerical domains. We show that traditional techniques from the literature ar
We propose a new mechanism to accurately answer a user-provided set of linear counting queries under local differential privacy (LDP). Given a set of linear counting queries (the workload) our mechanism automatically adapts to provide accuracy on the
Federated learning (FL) empowers distributed clients to collaboratively train a shared machine learning model through exchanging parameter information. Despite the fact that FL can protect clients raw data, malicious users can still crack original da
Private collection of statistics from a large distributed population is an important problem, and has led to large scale deployments from several leading technology companies. The dominant approach requires each user to randomly perturb their input,