بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Optimizing error of high-dimensional statistical queries under differential privacy

174 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ryan McKenna

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ryan McKenna - Gerome Miklau - Michael Hay

قواعد البيانات التشفير والأمن

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Differentially private algorithms for answering sets of predicate counting queries on a sensitive database have many applications. Organizations that collect individual-level data, such as statistical agencies and medical institutions, use them to safely release summary tabulations. However, existing techniques are accurate only on a narrow class of query workloads, or are extremely slow, especially when analyzing more than one or two dimensions of the data. In this work we propose HDMM, a new differentially private algorithm for answering a workload of predicate counting queries, that is especially effective for higher-dimensional datasets. HDMM represents query workloads using an implicit matrix representation and exploits this compact representation to efficiently search (a subset of) the space of differentially private algorithms for one that answers the input query workload with high accuracy. We empirically show that HDMM can efficiently answer queries with lower error than state-of-the-art techniques on a variety of low and high dimensional datasets.

قيم البحث

62 - Ryan McKenna , Gerome Miklau , Michael Hay 2021

In this work we describe the High-Dimensional Matrix Mechanism (HDMM), a differentially private algorithm for answering a workload of predicate counting queries. HDMM represents query workloads using a compact implicit matrix representation and explo its this representation to efficiently optimize over (a subset of) the space of differentially private algorithms for one that is unbiased and answers the input query workload with low expected error. HDMM can be deployed for both $epsilon$-differential privacy (with Laplace noise) and $(epsilon, delta)$-differential privacy (with Gaussian noise), although the core techniques are slightly different for each. We demonstrate empirically that HDMM can efficiently answer queries with lower expected error than state-of-the-art techniques, and in some cases, it nearly matches existing lower bounds for the particular class of mechanisms we consider.

قواعد البيانات التشفير والأمن

Answering Summation Queries for Numerical Attributes under Differential Privacy

81 - Yikai Wu , David Pujol , Ios Kotsogiannis 2019

In this work we explore the problem of answering a set of sum queries under Differential Privacy. This is a little understood, non-trivial problem especially in the case of numerical domains. We show that traditional techniques from the literature ar e not always the best choice and a more rigorous approach is necessary to develop low error algorithms.

قواعد البيانات التشفير والأمن

A workload-adaptive mechanism for linear queries under local differential privacy

91 - Ryan McKenna , Raj Kumar Maity , Arya Mazumdar 2020

We propose a new mechanism to accurately answer a user-provided set of linear counting queries under local differential privacy (LDP). Given a set of linear counting queries (the workload) our mechanism automatically adapts to provide accuracy on the workload queries. We define a parametric class of mechanisms that produce unbiased estimates of the workload, and formulate a constrained optimization problem to select a mechanism from this class that minimizes expected total squared error. We solve this optimization problem numerically using projected gradient descent and provide an efficient implementation that scales to large workloads. We demonstrate the effectiveness of our optimization-based approach in a wide variety of settings, showing that it outperforms many competitors, even outperforming existing mechanisms on the workloads for which they were intended.

قواعد البيانات التشفير والأمن

Optimizing the Numbers of Queries and Replies in Federated Learning with Differential Privacy

131 - Yipeng Zhou , Xuezheng Liu , Yao Fu 2021

Federated learning (FL) empowers distributed clients to collaboratively train a shared machine learning model through exchanging parameter information. Despite the fact that FL can protect clients raw data, malicious users can still crack original da ta with disclosed parameters. To amend this flaw, differential privacy (DP) is incorporated into FL clients to disturb original parameters, which however can significantly impair the accuracy of the trained model. In this work, we study a crucial question which has been vastly overlooked by existing works: what are the optimal numbers of queries and replies in FL with DP so that the final model accuracy is maximized. In FL, the parameter server (PS) needs to query participating clients for multiple global iterations to complete training. Each client responds a query from the PS by conducting a local iteration. Our work investigates how many times the PS should query clients and how many times each client should reply the PS. We investigate two most extensively used DP mechanisms (i.e., the Laplace mechanism and Gaussian mechanisms). Through conducting convergence rate analysis, we can determine the optimal numbers of queries and replies in FL with DP so that the final model accuracy can be maximized. Finally, extensive experiments are conducted with publicly available datasets: MNIST and FEMNIST, to verify our analysis and the results demonstrate that properly setting the numbers of queries and replies can significantly improve the final model accuracy in FL with DP.

التعلم الآلي التشفير والأمن النظم الموزعة والتوازية والحوسبة العنقودية

Frequency Estimation under Local Differential Privacy [Experiments, Analysis and Benchmarks]

82 - Graham Cormode , Samuel Maddock , Carsten Maple 2021

Private collection of statistics from a large distributed population is an important problem, and has led to large scale deployments from several leading technology companies. The dominant approach requires each user to randomly perturb their input, leading to guarantees in the local differential privacy model. In this paper, we place the various approaches that have been suggested into a common framework, and perform an extensive series of experiments to understand the tradeoffs between different implementation choices. Our conclusion is that for the core problems of frequency estimation and heavy hitter identification, careful choice of algorithms can lead to very effective solutions that scale to millions of users

قواعد البيانات التشفير والأمن

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

المعهد العالي لإدارة الأعمال

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Optimizing error of high-dimensional statistical queries under differential privacy

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً