ترغب بنشر مسار تعليمي؟ اضغط هنا

Mitigating Sybil Attacks on Differential Privacy based Federated Learning

184   0   0.0 ( 0 )
 نشر من قبل Yupeng Jiang
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In federated learning, machine learning and deep learning models are trained globally on distributed devices. The state-of-the-art privacy-preserving technique in the context of federated learning is user-level differential privacy. However, such a mechanism is vulnerable to some specific model poisoning attacks such as Sybil attacks. A malicious adversary could create multiple fake clients or collude compromised devices in Sybil attacks to mount direct model updates manipulation. Recent works on novel defense against model poisoning attacks are difficult to detect Sybil attacks when differential privacy is utilized, as it masks clients model updates with perturbation. In this work, we implement the first Sybil attacks on differential privacy based federated learning architectures and show their impacts on model convergence. We randomly compromise some clients by manipulating different noise levels reflected by the local privacy budget epsilon of differential privacy on the local model updates of these Sybil clients such that the global model convergence rates decrease or even leads to divergence. We apply our attacks to two recent aggregation defense mechanisms, called Krum and Trimmed Mean. Our evaluation results on the MNIST and CIFAR-10 datasets show that our attacks effectively slow down the convergence of the global models. We then propose a method to keep monitoring the average loss of all participants in each round for convergence anomaly detection and defend our Sybil attacks based on the prediction cost reported from each client. Our empirical study demonstrates that our defense approach effectively mitigates the impact of our Sybil attacks on model convergence.

قيم البحث

اقرأ أيضاً

Federated learning (FL) has emerged as a promising privacy-aware paradigm that allows multiple clients to jointly train a model without sharing their private data. Recently, many studies have shown that FL is vulnerable to membership inference attack s (MIAs) that can distinguish the training members of the given model from the non-members. However, existing MIAs ignore the source of a training member, i.e., the information of which client owns the training member, while it is essential to explore source privacy in FL beyond membership privacy of examples from all clients. The leakage of source information can lead to severe privacy issues. For example, identification of the hospital contributing to the training of an FL model for COVID-19 pandemic can render the owner of a data record from this hospital more prone to discrimination if the hospital is in a high risk region. In this paper, we propose a new inference attack called source inference attack (SIA), which can derive an optimal estimation of the source of a training member. Specifically, we innovatively adopt the Bayesian perspective to demonstrate that an honest-but-curious server can launch an SIA to steal non-trivial source information of the training members without violating the FL protocol. The server leverages the prediction loss of local models on the training members to achieve the attack effectively and non-intrusively. We conduct extensive experiments on one synthetic and five real datasets to evaluate the key factors in an SIA, and the results show the efficacy of the proposed source inference attack.
121 - Yao Fu , Yipeng Zhou , Di Wu 2021
In spite that Federated Learning (FL) is well known for its privacy protection when training machine learning models among distributed clients collaboratively, recent studies have pointed out that the naive FL is susceptible to gradient leakage attac ks. In the meanwhile, Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks. However, the adoption of DP by clients in FL may significantly jeopardize the model accuracy. It is still an open problem to understand the practicality of DP from a theoretic perspective. In this paper, we make the first attempt to understand the practicality of DP in FL through tuning the number of conducted iterations. Based on the FedAvg algorithm, we formally derive the convergence rate with DP noises in FL. Then, we theoretically derive: 1) the conditions for the DP based FedAvg to converge as the number of global iterations (GI) approaches infinity; 2) the method to set the number of local iterations (LI) to minimize the negative influence of DP noises. By further substituting the Laplace and Gaussian mechanisms into the derived convergence rate respectively, we show that: 3) The DP based FedAvg with the Laplace mechanism cannot converge, but the divergence rate can be effectively prohibited by setting the number of LIs with our method; 4) The learning error of the DP based FedAvg with the Gaussian mechanism can converge to a constant number finally if we use a fixed number of LIs per GI. To verify our theoretical findings, we conduct extensive experiments using two real-world datasets. The results not only validate our analysis results, but also provide useful guidelines on how to optimize model accuracy when incorporating DP into FL
Federated Learning (FL) allows multiple participants to train machine learning models collaboratively by keeping their datasets local and only exchanging model updates. Alas, recent work highlighted several privacy and robustness weaknesses in FL, pr esenting, respectively, membership/property inference and backdoor attacks. In this paper, we investigate to what extent Differential Privacy (DP) can be used to protect not only privacy but also robustness in FL. We present a first-of-its-kind empirical evaluation of Local and Central Differential Privacy (LDP/CDP) techniques in FL, assessing their feasibility and effectiveness. We show that both DP variants do defend against backdoor attacks, with varying levels of protection and utility, and overall much more effectively than previously proposed defenses. They also mitigate white-box membership inference attacks in FL, and our work is the first to show how effectively; neither, however, provides viable defenses against property inference. Our work also provides a re-usable measurement framework to quantify the trade-offs between robustness/privacy and utility in differentially private FL.
75 - Gan Sun , Yang Cong 2020
Federated machine learning which enables resource constrained node devices (e.g., mobile phones and IoT devices) to learn a shared model while keeping the training data local, can provide privacy, security and economic benefits by designing an effect ive communication protocol. However, the communication protocol amongst different nodes could be exploited by attackers to launch data poisoning attacks, which has been demonstrated as a big threat to most machine learning models. In this paper, we attempt to explore the vulnerability of federated machine learning. More specifically, we focus on attacking a federated multi-task learning framework, which is a federated learning framework via adopting a general multi-task learning framework to handle statistical challenges. We formulate the problem of computing optimal poisoning attacks on federated multi-task learning as a bilevel program that is adaptive to arbitrary choice of target nodes and source attacking nodes. Then we propose a novel systems-aware optimization method, ATTack on Federated Learning (AT2FL), which is efficiency to derive the implicit gradients for poisoned data, and further compute optimal attack strategies in the federated machine learning. Our work is an earlier study that considers issues of data poisoning attack for federated learning. To the end, experimental results on real-world datasets show that federated multi-task learning model is very sensitive to poisoning attacks, when the attackers either directly poison the target nodes or indirectly poison the related nodes by exploiting the communication protocol.
Federated learning (FL) is an emerging paradigm that enables multiple organizations to jointly train a model without revealing their private data to each other. This paper studies {it vertical} federated learning, which tackles the scenarios where (i ) collaborating organizations own data of the same set of users but with disjoint features, and (ii) only one organization holds the labels. We propose Pivot, a novel solution for privacy preserving vertical decision tree training and prediction, ensuring that no intermediate information is disclosed other than those the clients have agreed to release (i.e., the final tree model and the prediction output). Pivot does not rely on any trusted third party and provides protection against a semi-honest adversary that may compromise $m-1$ out of $m$ clients. We further identify two privacy leakages when the trained decision tree model is released in plaintext and propose an enhanced protocol to mitigate them. The proposed solution can also be extended to tree ensemble models, e.g., random forest (RF) and gradient boosting decision tree (GBDT) by treating single decision trees as building blocks. Theoretical and experimental analysis suggest that Pivot is efficient for the privacy achieved.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا