
An Accuracy-Lossless Perturbation Method for Defending Privacy Attacks in Federated Learning

Published by: Yan Feng
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





Although federated learning improves the privacy of training data by exchanging local gradients or parameters rather than raw data, an adversary can still leverage those gradients and parameters to recover local training data by launching reconstruction and membership inference attacks. To defend against such privacy attacks, many noise perturbation methods (such as differential privacy or the CountSketch matrix) have been designed. However, these schemes cannot ensure strong defence and high learning accuracy at the same time, which impedes the wide application of FL in practice (especially for medical or financial institutions that require both high accuracy and strong privacy guarantees). To overcome this issue, in this paper we propose an efficient model perturbation method for federated learning to defend against reconstruction and membership inference attacks launched by curious clients. On the one hand, similar to differential privacy, our method selects random numbers as perturbation noise added to the global model parameters, so it is efficient and easy to integrate in practice. Moreover, the randomly selected noise values are positive real numbers that can be arbitrarily large, which ensures strong defence. On the other hand, unlike differential privacy and other perturbation methods, which cannot eliminate the added noise, our method allows the server to recover the true gradients by eliminating it. Therefore, our method does not hinder learning accuracy at all.
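The abstract does not spell out the construction, but the add-then-eliminate idea can be illustrated with a minimal sketch: the server keeps the randomly drawn positive noise it adds to the global parameters and later subtracts it exactly, so no error remains in the recovered quantity (unlike differential privacy, where the noise persists). The function names and the purely additive form below are illustrative assumptions, not the authors' actual protocol.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def perturb(global_params, scale=1e6):
    """Server side (assumed): add secret, arbitrarily large positive noise to the
    global model parameters before sending them out, and keep the noise."""
    noise = rng.uniform(1.0, scale, size=global_params.shape)
    return global_params + noise, noise

def recover(perturbed_value, noise):
    """Server side (assumed): eliminate the stored noise exactly, so the recovered
    value is lossless, in contrast to DP, where the noise is never removed."""
    return perturbed_value - noise

# Toy check of losslessness: perturb-then-recover returns the original exactly.
w = np.array([0.5, -1.2, 3.0])
w_noisy, r = perturb(w)
assert np.allclose(recover(w_noisy, r), w)
```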




Read also

Researchers have recently studied input leakage problems in Federated Learning (FL), where a malicious party can reconstruct sensitive training inputs provided by users from shared gradients. This raises concerns because input leakage contradicts the privacy-preserving intention of using FL. Despite a relatively rich literature on attacks and defenses for input reconstruction in horizontal FL, input leakage and protection in vertical FL have only recently started to draw researchers' attention. In this paper, we study how to defend against input leakage attacks in vertical FL. We design an adversarial-training-based framework that contains three modules: adversarial reconstruction, noise regularization, and distance correlation minimization. These modules can be employed individually or applied together, since they are independent of each other. Through extensive experiments on a large-scale industrial online advertising dataset, we show that our framework is effective in protecting input privacy while retaining model utility.
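The abstract names a distance correlation minimization module but does not give its loss. As a hedged illustration, the snippet below computes the standard empirical (V-statistic) distance correlation between a batch of raw inputs and the embeddings a party would share; a defender could add this quantity, scaled by a weight, to the training loss to discourage the embeddings from encoding the inputs. The function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def distance_correlation(x, y):
    """Empirical distance correlation between two batches (rows = samples).
    Values near 0 mean the shared embeddings y carry little information
    associated, linearly or nonlinearly, with the raw inputs x."""
    def doubly_centered_dists(z):
        d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
        return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()
    a, b = doubly_centered_dists(x), doubly_centered_dists(y)
    dcov2 = (a * b).mean()
    dvar_x, dvar_y = (a * a).mean(), (b * b).mean()
    return np.sqrt(max(dcov2, 0.0) / (np.sqrt(dvar_x * dvar_y) + 1e-12))
```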
Federated learning is a distributed machine learning framework that enables collaborative training across multiple parties while ensuring data privacy. Practical adaptation of XGBoost, the state-of-the-art tree boosting framework, to federated learning remains limited due to the high cost incurred by conventional privacy-preserving methods. To address the problem, we propose two variants of federated XGBoost with privacy guarantees: FedXGBoost-SMM and FedXGBoost-LDP. Our first protocol, FedXGBoost-SMM, deploys an enhanced secure matrix multiplication method to preserve privacy with lossless accuracy and lower overhead than encryption-based techniques. Developed independently, the second protocol, FedXGBoost-LDP, is heuristically designed with noise perturbation for local differential privacy and is empirically evaluated on real-world and synthetic datasets.
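The abstract states that FedXGBoost-LDP perturbs shared statistics with noise for local differential privacy but does not name the mechanism. A common choice for such a sketch is the Laplace mechanism applied to each client's boosting statistics before release; the sensitivity bound and parameter names below are assumptions for illustration only.

```python
import numpy as np

def ldp_release(stats, epsilon, sensitivity, rng=np.random.default_rng()):
    """Perturb per-client boosting statistics (e.g. gradient/Hessian sums) with
    Laplace noise scaled to sensitivity/epsilon before sharing them."""
    scale = sensitivity / epsilon
    return np.asarray(stats, dtype=float) + rng.laplace(0.0, scale, size=np.shape(stats))
```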
We consider the problem of reinforcing federated learning with formal privacy guarantees. We propose to employ Bayesian differential privacy, a relaxation of differential privacy for similarly distributed data, to provide sharper privacy loss bounds. We adapt the Bayesian privacy accounting method to the federated setting and suggest multiple improvements for more efficient privacy budgeting at different levels. Our experiments show a significant advantage over state-of-the-art differential privacy bounds for federated learning on image classification tasks, including a medical application, bringing the privacy budget below 1 at the client level and below 0.1 at the instance level. Lower amounts of noise also benefit model accuracy and reduce the number of communication rounds.
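The Bayesian accountant itself is involved, but the mechanism it accounts for is the usual clip-and-noise treatment of client updates; a minimal sketch of that client-side step is shown below. The clip norm and noise multiplier are hypothetical parameters, and the accounting of the resulting privacy loss across rounds is what the paper's Bayesian method sharpens.

```python
import numpy as np

def privatize_client_update(update, clip_norm=1.0, noise_multiplier=1.1,
                            rng=np.random.default_rng()):
    """Clip the update to a bounded norm, then add Gaussian noise with standard
    deviation noise_multiplier * clip_norm; a privacy accountant tracks the
    cumulative privacy loss over federated rounds."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + rng.normal(0.0, noise_multiplier * clip_norm, size=update.shape)
```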
Federated learning (FL) is an emerging distributed machine learning framework for collaborative model training with a network of clients (edge devices). FL offers default client privacy by allowing clients to keep their sensitive data on local devices and to share only local training parameter updates with the federated server. However, recent studies have shown that even sharing local parameter updates from a client to the federated server may be susceptible to gradient leakage attacks that intrude on the client's privacy regarding its training data. In this paper, we present a principled framework for evaluating and comparing different forms of client privacy leakage attacks. We first provide formal and experimental analysis to show how adversaries can reconstruct private local training data by simply analyzing the shared parameter update from local training (e.g., the local gradient or weight update vector). We then analyze how different hyperparameter configurations in federated learning and different settings of the attack algorithm may impact both attack effectiveness and attack cost. Our framework also measures, evaluates, and analyzes the effectiveness of client privacy leakage attacks under different gradient compression ratios when using communication-efficient FL protocols. Our experiments also include some preliminary mitigation strategies to highlight the importance of providing a systematic attack evaluation framework for an in-depth understanding of the various forms of client privacy leakage threats in federated learning and for developing theoretical foundations for attack mitigation.
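The reconstruction step such a framework evaluates follows the familiar gradient-matching pattern: the adversary optimizes dummy inputs and labels until their gradients match the update shared by the victim client. The sketch below is a generic DLG-style attack written for illustration; the model, loss, and optimizer settings are assumptions, not the paper's exact configuration.

```python
import torch

def gradient_matching_attack(model, loss_fn, observed_grads, x_shape, y_shape, steps=50):
    """Optimize dummy data so its gradients match the gradients shared by a client.
    loss_fn must be differentiable in both arguments (e.g. MSE or soft-label loss)."""
    dummy_x = torch.randn(x_shape, requires_grad=True)
    dummy_y = torch.randn(y_shape, requires_grad=True)
    opt = torch.optim.LBFGS([dummy_x, dummy_y])

    def closure():
        opt.zero_grad()
        loss = loss_fn(model(dummy_x), dummy_y)
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        diff = sum(((g - og) ** 2).sum() for g, og in zip(grads, observed_grads))
        diff.backward()
        return diff

    for _ in range(steps):
        opt.step(closure)
    return dummy_x.detach(), dummy_y.detach()
```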
Kai-Fung Chu, Lintao Zhang (2021)
Many application scenarios call for training a machine learning model among multiple participants. Federated learning (FL) was proposed to enable joint training of a deep learning model using the local data of each party without revealing the data to others. Among the various types of FL methods, vertical FL is the category that handles data sources with the same ID space and different feature spaces. However, existing vertical FL methods suffer from limitations such as restrictive neural network structures and slow training speed, and they often lack the ability to take advantage of data with unmatched IDs. In this work, we propose an FL method called self-taught federated learning to address the aforementioned issues, which uses unsupervised feature extraction techniques for distributed supervised deep learning tasks. In this method, only latent variables are transmitted to other parties for model training, while privacy is preserved by storing the data and the parameters of activations, weights, and biases locally. Extensive experiments are performed to evaluate and demonstrate the validity and efficiency of the proposed method.
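A minimal sketch of the "only latent variables leave the party" idea, assuming each party fits an unsupervised feature extractor on its raw features (here a plain PCA encoder via SVD) and forwards only the encoded representation; the extractor choice and names are illustrative, not the paper's architecture.

```python
import numpy as np

def fit_local_encoder(X, latent_dim):
    """Fit an unsupervised linear encoder (PCA via SVD) on the party's raw
    features; only the resulting latent codes are transmitted, never X itself."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:latent_dim]
    return lambda Z: (Z - mean) @ components.T

# Party A keeps X_a local and shares only the latent codes with the training party.
X_a = np.random.default_rng(1).normal(size=(128, 20))
encode_a = fit_local_encoder(X_a, latent_dim=4)
latents_to_share = encode_a(X_a)
```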
