Privacy Budget Scheduling

59 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Pierre Tholoniat

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Tao Luo - Mingen Pan - Pierre Tholoniat

التشفير والأمن النظم الموزعة والتوازية والحوسبة العنقودية التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Machine learning (ML) models trained on personal data have been shown to leak information about users. Differential privacy (DP) enables model training with a guaranteed bound on this leakage. Each new model trained with DP increases the bound on data leakage and can be seen as consuming part of a global privacy budget that should not be exceeded. This budget is a scarce resource that must be carefully managed to maximize the number of successfully trained models. We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator that adds privacy as a new type of resource to be managed alongside other traditional compute resources, such as CPU, GPU, and memory. The abstractions we design for the privacy resource mirror those defined by Kubernetes for traditional resources, but there are also major differences. For example, traditional compute resources are replenishable while privacy is not: a CPU can be regained after a model finishes execution while privacy budget cannot. This distinction forces a re-design of the scheduler. We present DPF (Dominant Private Block Fairness) -- a variant of the popular Dominant Resource Fairness (DRF) algorithm -- that is geared toward the non-replenishable privacy resource but enjoys similar theoretical properties as DRF. We evaluate PrivateKube and DPF on microbenchmarks and an ML workload on Amazon Reviews data. Compared to existing baselines, DPF allows training more models under the same global privacy guarantee. This is especially true for DPF over Renyi DP, a highly composable form of DP.

قيم البحث

136 - Ashish Dandekar , Debabrota Basu , Stephane Bressan 2020

The calibration of noise for a privacy-preserving mechanism depends on the sensitivity of the query and the prescribed privacy level. A data steward must make the non-trivial choice of a privacy level that balances the requirements of users and the m onetary constraints of the business entity. We analyse roles of the sources of randomness, namely the explicit randomness induced by the noise distribution and the implicit randomness induced by the data-generation distribution, that are involved in the design of a privacy-preserving mechanism. The finer analysis enables us to provide stronger privacy guarantees with quantifiable risks. Thus, we propose privacy at risk that is a probabilistic calibration of privacy-preserving mechanisms. We provide a composition theorem that leverages privacy at risk. We instantiate the probabilistic calibration for the Laplace mechanism by providing analytical results. We also propose a cost model that bridges the gap between the privacy level and the compensation budget estimated by a GDPR compliant business entity. The convexity of the proposed cost model leads to a unique fine-tuning of privacy level that minimises the compensation budget. We show its effectiveness by illustrating a realistic scenario that avoids overestimation of the compensation budget by using privacy at risk for the Laplace mechanism. We quantitatively show that composition using the cost optimal privacy at risk provides stronger privacy guarantee than the classical advanced composition.

التشفير والأمن بنى وهياكل البيانات والخوارزميات التعلم الآلي

CloudMine: Multi-Party Privacy-Preserving Data Analytics Service

396 - Dinh Tien Tuan Anh , Quach Vinh Thanh , Anwitaman Datta 2012

An increasing number of businesses are replacing their data storage and computation infrastructure with cloud services. Likewise, there is an increased emphasis on performing analytics based on multiple datasets obtained from different data sources. While ensuring security of data and computation outsourced to a third party cloud is in itself challenging, supporting analytics using data distributed across multiple, independent clouds is even further from trivial. In this paper we present CloudMine, a cloud-based service which allows multiple data owners to perform privacy-preserved computation over the joint data using their clouds as delegates. CloudMine protects data privacy with respect to semi-honest data owners and semi-honest clouds. It furthermore ensures the privacy of the computation outputs from the curious clouds. It allows data owners to reliably detect if their cloud delegates have been lazy when carrying out the delegated computation. CloudMine can run as a centralized service on a single cloud, or as a distributed service over multiple, independent clouds. CloudMine supports a set of basic computations that can be used to construct a variety of highly complex, distributed privacy-preserving data analytics. We demonstrate how a simple instance of CloudMine (secure sum service) is used to implement three classical data mining tasks (classification, association rule mining and clustering) in a cloud environment. We experiment with a prototype of the service, the results of which suggest its practicality for supporting privacy-preserving data analytics as a (multi) cloud-based service.

التشفير والأمن النظم الموزعة والتوازية والحوسبة العنقودية

FED-$chi^2$: Privacy Preserving Federated Correlation Test

101 - Lun Wang , Qi Pang , Shuai Wang 2021

In this paper, we propose the first secure federated $chi^2$-test protocol Fed-$chi^2$. To minimize both the privacy leakage and the communication cost, we recast $chi^2$-test to the second moment estimation problem and thus can take advantage of sta ble projection to encode the local information in a short vector. As such encodings can be aggregated with only summation, secure aggregation can be naturally applied to hide the individual updates. We formally prove the security guarantee of Fed-$chi^2$ that the joint distribution is hidden in a subspace with exponential possible distributions. Our evaluation results show that Fed-$chi^2$ achieves negligible accuracy drops with small client-side computation overhead. In several real-world case studies, the performance of Fed-$chi^2$ is comparable to the centralized $chi^2$-test.

التشفير والأمن النظم الموزعة والتوازية والحوسبة العنقودية

Mitigating Sybil Attacks on Differential Privacy based Federated Learning

183 - Yupeng Jiang , Yong Li , Yipeng Zhou 2020

In federated learning, machine learning and deep learning models are trained globally on distributed devices. The state-of-the-art privacy-preserving technique in the context of federated learning is user-level differential privacy. However, such a m echanism is vulnerable to some specific model poisoning attacks such as Sybil attacks. A malicious adversary could create multiple fake clients or collude compromised devices in Sybil attacks to mount direct model updates manipulation. Recent works on novel defense against model poisoning attacks are difficult to detect Sybil attacks when differential privacy is utilized, as it masks clients model updates with perturbation. In this work, we implement the first Sybil attacks on differential privacy based federated learning architectures and show their impacts on model convergence. We randomly compromise some clients by manipulating different noise levels reflected by the local privacy budget epsilon of differential privacy on the local model updates of these Sybil clients such that the global model convergence rates decrease or even leads to divergence. We apply our attacks to two recent aggregation defense mechanisms, called Krum and Trimmed Mean. Our evaluation results on the MNIST and CIFAR-10 datasets show that our attacks effectively slow down the convergence of the global models. We then propose a method to keep monitoring the average loss of all participants in each round for convergence anomaly detection and defend our Sybil attacks based on the prediction cost reported from each client. Our empirical study demonstrates that our defense approach effectively mitigates the impact of our Sybil attacks on model convergence.

التشفير والأمن النظم الموزعة والتوازية والحوسبة العنقودية

On the Privacy Guarantees of Gossip Protocols in General Networks

110 - Richeng Jin , Yufan Huang , 2019

Recently, the privacy guarantees of information dissemination protocols have attracted increasing research interests, among which the gossip protocols assume vital importance in various information exchange applications. In this work, we study the pr ivacy guarantees of gossip protocols in general networks in terms of differential privacy and prediction uncertainty. First, lower bounds of the differential privacy guarantees are derived for gossip protocols in general networks in both synchronous and asynchronous settings. The prediction uncertainty of the source node given a uniform prior is also determined. For the private gossip algorithm, the differential privacy and prediction uncertainty guarantees are derived in closed form. Moreover, considering that these two metrics may be restrictive in some scenarios, the relaxed variants are proposed. It is found that source anonymity is closely related to some key network structure parameters in the general network setting. Then, we investigate information spreading in wireless networks with unreliable communications, and quantify the tradeoff between differential privacy guarantees and information spreading efficiency. Finally, considering that the attacker may not be present at the beginning of the information dissemination process, the scenario of delayed monitoring is studied and the corresponding differential privacy guarantees are evaluated.

التشفير والأمن النظم الموزعة والتوازية والحوسبة العنقودية