
Byzantine-Robust and Privacy-Preserving Framework for FedML

Published by: Hanieh Hashemi
Publication date: 2021
Research field: Informatics Engineering
Paper language: English

Federated learning has emerged as a popular paradigm for collaboratively training a model from data distributed among a set of clients. This learning setting presents, among others, two unique challenges: how to protect the privacy of the clients' data during training, and how to ensure the integrity of the trained model. We propose a two-pronged solution that aims to address both challenges under a single framework. First, we propose to create secure enclaves using a trusted execution environment (TEE) within the server. Each client can then encrypt their gradients and send them to verifiable enclaves. The gradients are decrypted within the enclave without the fear of privacy breaches. However, robustness check computations in a TEE are computationally prohibitive. Hence, in the second step, we perform a novel gradient encoding that enables TEEs to encode the gradients and then offload Byzantine check computations to accelerators such as GPUs. Our proposed approach provides theoretical bounds on information leakage and offers a significant speed-up over the baseline in empirical evaluation.
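
To make the pipeline concrete, the following is a minimal Python/NumPy sketch of the kind of workflow the abstract describes, under an assumption of my own: the TEE's gradient encoding is represented here by a simple additive random mask, which is not the paper's scheme but shares the property that pairwise distances are preserved, so a distance-based Byzantine check can run on coded data outside the enclave. All function names (encode_gradients, byzantine_filter) are hypothetical.

# Illustrative sketch (not the paper's exact protocol): the TEE decrypts client
# gradients, encodes them with a shared random mask, and hands the coded
# gradients to a GPU-style worker that runs a distance-based Byzantine check.
import numpy as np

rng = np.random.default_rng(0)

def encode_gradients(grads, mask):
    """Additive masking: pairwise differences (and distances) are preserved,
    so robustness checks that only need distances can run on coded data."""
    return grads + mask  # broadcast the same mask to every client row

def byzantine_filter(coded, keep):
    """Keep the `keep` clients whose coded gradients have the smallest total
    distance to the others (a Krum/Multi-Krum-style score on coded data)."""
    d = np.linalg.norm(coded[:, None, :] - coded[None, :, :], axis=-1)
    scores = d.sum(axis=1)
    return np.argsort(scores)[:keep]

# Toy run: 8 honest clients near the true gradient, 2 Byzantine outliers.
dim, honest, byz = 16, 8, 2
true_grad = rng.normal(size=dim)
grads = np.vstack([true_grad + 0.1 * rng.normal(size=(honest, dim)),
                   10.0 * rng.normal(size=(byz, dim))])

mask = rng.normal(size=dim)            # generated inside the TEE
coded = encode_gradients(grads, mask)  # only coded data leaves the enclave
selected = byzantine_filter(coded, keep=honest)

# The TEE aggregates the (decoded) gradients of the selected clients.
agg = grads[selected].mean(axis=0)
print("selected clients:", sorted(selected.tolist()))

The design point the sketch illustrates: adding the same mask to every gradient leaves every pairwise difference, and hence every pairwise distance, unchanged, which is exactly what distance-based robustness scores such as Krum need.
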




Read also

Privacy and security-related concerns are growing as machine learning reaches diverse application domains. The data holders want to train with private data while exploiting accelerators, such as GPUs, that are hosted in the cloud. However, Cloud systems are vulnerable to attackers that compromise the privacy of data and integrity of computations. This work presents DarKnight, a framework for large DNN training while protecting input privacy and computation integrity. DarKnight relies on cooperative execution between trusted execution environments (TEE) and accelerators, where the TEE provides privacy and integrity verification, while accelerators perform the computation heavy linear algebraic operations.
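
As a rough illustration of this division of labor (not DarKnight's actual encoding or verification protocol), the sketch below assumes a single linear layer with non-private weights: the TEE blinds a private batch with a random invertible code, the accelerator performs the heavy matrix multiplication on the coded batch, and the TEE both decodes the result with cheap (k+1)-sized combinations and spot-checks its integrity with a Freivalds-style random projection.

# Illustrative TEE/accelerator split under simplifying assumptions.
import numpy as np

rng = np.random.default_rng(1)
d, m, k = 512, 256, 8            # input dim, output dim, batch size

W = rng.normal(size=(m, d))      # layer weights (not private, known to both sides)
X = rng.normal(size=(d, k))      # private inputs (inside the TEE)
r = rng.normal(size=(d, 1))      # blinding noise (inside the TEE)

A = rng.normal(size=(k + 1, k + 1))   # random coding matrix (invertible w.p. 1)
Y = np.hstack([X, r]) @ A             # coded batch leaves the enclave

Z = W @ Y                             # heavy linear algebra on the accelerator

# TEE-side decode: only (k+1)-sized combinations of the accelerator's outputs.
WX = (Z @ np.linalg.inv(A))[:, :k]
assert np.allclose(WX, W @ X)

# TEE-side integrity check (Freivalds): verify Z == W @ Y with one cheap
# random projection instead of redoing the full batch multiplication.
v = rng.normal(size=(k + 1, 1))
assert np.allclose(Z @ v, W @ (Y @ v))
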
Federated learning (FL) has enabled training models collaboratively from multiple data-owning parties without sharing their data. Given the privacy regulations on patients' healthcare data, learning-based systems in healthcare can greatly benefit from privacy-preserving FL approaches. However, typical model aggregation methods in FL are sensitive to local model updates, which may lead to failure in learning a robust and accurate global model. In this work, we implement and evaluate different robust aggregation methods in FL applied to healthcare data. Furthermore, we show that such methods can detect and discard faulty or malicious local clients during training. We run two sets of experiments using two real-world healthcare datasets for training medical diagnosis classification tasks. Each dataset is used to simulate the performance of three different robust FL aggregation strategies when facing different poisoning attacks. The results show that privacy-preserving methods can be successfully applied alongside Byzantine-robust aggregation techniques. In particular, we observed that using differential privacy (DP) did not significantly impact the final learning convergence of the different aggregation strategies.
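
For concreteness, here is a minimal sketch of one Byzantine-robust aggregation rule of the kind such studies compare, coordinate-wise median over clipped client updates, combined with simple Gaussian noise addition in the spirit of DP. The clip norm, noise scale, and client counts are illustrative choices, not the paper's settings.

# Minimal robust aggregation sketch: clip, noise, coordinate-wise median.
import numpy as np

def clip_update(u, max_norm):
    """Clip a client update to a fixed L2 norm (standard preprocessing for DP)."""
    n = np.linalg.norm(u)
    return u * min(1.0, max_norm / (n + 1e-12))

def robust_aggregate(updates, max_norm=1.0, noise_std=0.01, rng=None):
    """Coordinate-wise median of clipped, Gaussian-noised client updates."""
    rng = rng or np.random.default_rng()
    clipped = np.stack([clip_update(u, max_norm) for u in updates])
    noised = clipped + rng.normal(scale=noise_std, size=clipped.shape)
    return np.median(noised, axis=0)

rng = np.random.default_rng(2)
honest = [rng.normal(loc=0.1, scale=0.02, size=10) for _ in range(9)]
poisoned = [np.full(10, -50.0)]                # one malicious client's update
print(robust_aggregate(honest + poisoned, rng=rng))  # stays near 0.1 per coordinate
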
Releasing full data records is one of the most challenging problems in data privacy. On the one hand, many of the popular techniques, such as data de-identification, are problematic because of their dependence on the background knowledge of adversaries. On the other hand, rigorous methods such as the exponential mechanism for differential privacy are often computationally impractical for releasing high-dimensional data, or cannot preserve high utility of the original data due to their extensive data perturbation. This paper presents a criterion called plausible deniability that provides a formal privacy guarantee, notably for releasing sensitive datasets: an output record can be released only if a certain number of input records are indistinguishable, up to a privacy parameter. This notion does not depend on the background knowledge of an adversary. Also, it can efficiently be checked by privacy tests. We present mechanisms to generate synthetic datasets with similar statistical properties to the input data and the same format. We study this technique both theoretically and experimentally. A key theoretical result shows that, with proper randomization, the plausible deniability mechanism generates differentially private synthetic data. We demonstrate the efficiency of this generative technique on a large dataset; it is shown to preserve the utility of the original data with respect to various statistical analyses and machine learning measures.
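
A minimal sketch of the plausible-deniability test itself, under a toy generative mechanism of my own choosing (an isotropic Gaussian centered on the seed record): a candidate synthetic record is released only if at least k input records could have generated it with probability within a factor gamma of the true seed's. The values of k, gamma, and sigma here are illustrative.

# Toy plausible-deniability privacy test.
import numpy as np

def log_p(y, x, sigma=0.5):
    """Toy generator: y ~ N(x, sigma^2 I); the normalizing constant cancels
    when comparing seeds, so it is omitted."""
    return -np.sum((y - x) ** 2) / (2.0 * sigma ** 2)

def plausibly_deniable(y, seed, dataset, k=3, gamma=2.0, sigma=0.5):
    """Release y only if at least k records (including the seed) could have
    generated it with probability within a factor gamma of the seed's."""
    ref = log_p(y, seed, sigma)
    plausible = sum(
        1 for x in dataset
        if abs(log_p(y, x, sigma) - ref) <= np.log(gamma)
    )
    return plausible >= k

rng = np.random.default_rng(3)
data = rng.normal(size=(100, 4))           # input dataset
seed = data[0]
y = seed + 0.5 * rng.normal(size=4)        # candidate synthetic record
print(plausibly_deniable(y, seed, data))   # released only if the test passes
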
Di Zhuang, J. Morris Chang (2020)
In the big data era, more and more cloud-based, data-driven applications are developed that leverage individual data to provide certain valuable services (the utilities). On the other hand, since the same set of individual data could be utilized to infer individuals' sensitive information, it creates new channels to snoop on individuals' privacy. Hence it is of great importance to develop techniques that enable data owners to release privatized data that can still be utilized for a certain intended purpose. Existing data-releasing approaches, however, are either privacy-emphasized (no consideration of utility) or utility-driven (no guarantees on privacy). In this work, we propose a two-step, perturbation-based, utility-aware privacy-preserving data-releasing framework. First, certain predefined privacy and utility problems are learned from public-domain data (background knowledge). Later, our approach leverages the learned knowledge to precisely perturb the data owner's data into privatized data that can be successfully utilized for the intended purpose (learning to succeed), without jeopardizing the predefined privacy (training to fail). Extensive experiments have been conducted on Human Activity Recognition, Census Income and Bank Marketing datasets to demonstrate the effectiveness and practicality of our framework.
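
A toy illustration of the "learning to succeed, training to fail" idea, under strong simplifying assumptions that are mine, not the paper's: both the intended task and the sensitive task are linear scorers assumed to have been learned beforehand from public data, and the record is perturbed along the component of the privacy direction that is orthogonal to the utility direction, so the utility score is left untouched while the privacy score is pushed across the decision boundary. The helper name privatize is hypothetical.

# Toy utility-preserving, privacy-destroying perturbation with linear scorers.
import numpy as np

def privatize(x, w_utility, w_privacy, strength=5.0):
    """Perturb x without changing w_utility @ x, while pushing w_privacy @ x
    across zero (hypothetical helper, not the paper's algorithm)."""
    u = w_utility / np.linalg.norm(w_utility)
    p_orth = w_privacy - (w_privacy @ u) * u      # privacy dir, minus utility part
    p_orth /= np.linalg.norm(p_orth) + 1e-12
    return x - strength * np.sign(w_privacy @ x) * p_orth

rng = np.random.default_rng(4)
w_u, w_p = rng.normal(size=20), rng.normal(size=20)
x = rng.normal(size=20)

x_priv = privatize(x, w_u, w_p)
print("utility score:", w_u @ x, "->", w_u @ x_priv)   # unchanged (up to fp error)
print("privacy score:", w_p @ x, "->", w_p @ x_priv)   # pushed across the boundary
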
Collaborative filtering recommendation systems provide recommendations to users based on their own past preferences, as well as those of other users who share similar interests. The use of recommendation systems has grown widely in recent years, helping people choose which movies to watch, books to read, and items to buy. However, users are often concerned about their privacy when using such systems, and many users are reluctant to provide accurate information to most online services. Privacy-preserving collaborative filtering recommendation systems aim to provide users with accurate recommendations while maintaining certain guarantees about the privacy of their data. This survey examines the recent literature in privacy-preserving collaborative filtering, providing a broad perspective of the field and classifying the key contributions in the literature using two different criteria: the type of vulnerability they address and the type of approach they use to solve it.
