CRFL: Certifiably Robust Federated Learning against Backdoor Attacks

128 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Chulin Xie

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Chulin Xie - Minghao Chen - Pin-Yu Chen

التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Federated Learning (FL) as a distributed learning paradigm that aggregates information from diverse clients to train a shared global model, has demonstrated great success. However, malicious clients can perform poisoning attacks and model replacement to introduce backdoors into the trained global model. Although there have been intensive studies designing robust aggregation methods and empirical robust federated training protocols against backdoors, existing approaches lack robustness certification. This paper provides the first general framework, Certifiably Robust Federated Learning (CRFL), to train certifiably robust FL models against backdoors. Our method exploits clipping and smoothing on model parameters to control the global model smoothness, which yields a sample-wise robustness certification on backdoors with limited magnitude. Our certification also specifies the relation to federated learning parameters, such as poisoning ratio on instance level, number of attackers, and training iterations. Practically, we conduct comprehensive experiments across a range of federated datasets, and provide the first benchmark for certified robustness against backdoor attacks in federated learning. Our code is available at https://github.com/AI-secure/CRFL.

قيم البحث

125 - Jonathan Hayase , Weihao Kong , Raghav Somani 2021

Modern machine learning increasingly requires training on a large collection of data from multiple sources, not all of which can be trusted. A particularly concerning scenario is when a small fraction of poisoned data changes the behavior of the trai ned model when triggered by an attacker-specified watermark. Such a compromised model will be deployed unnoticed as the model is accurate otherwise. There have been promising attempts to use the intermediate representations of such a model to separate corrupted examples from clean ones. However, these defenses work only when a certain spectral signature of the poisoned examples is large enough for detection. There is a wide range of attacks that cannot be protected against by the existing defenses. We propose a novel defense algorithm using robust covariance estimation to amplify the spectral signature of corrupted data. This defense provides a clean model, completely removing the backdoor, even in regimes where previous methods have no hope of detecting the poisoned examples. Code and pre-trained models are available at https://github.com/SewoongLab/spectre-defense .

التعلم الآلي الذكاء الاصطناعي التعلم الالي

Certifiably-Robust Federated Adversarial Learning via Randomized Smoothing

82 - Cheng Chen , Bhavya Kailkhura , Ryan Goldhahn 2021

Federated learning is an emerging data-private distributed learning framework, which, however, is vulnerable to adversarial attacks. Although several heuristic defenses are proposed to enhance the robustness of federated learning, they do not provide certifiable robustness guarantees. In this paper, we incorporate randomized smoothing techniques into federated adversarial training to enable data-private distributed learning with certifiable robustness to test-time adversarial perturbations. Our experiments show that such an advanced federated adversarial learning framework can deliver models as robust as those trained by the centralized training. Further, this enables provably-robust classifiers to $ell_2$-bounded adversarial perturbations in a distributed setup. We also show that one-point gradient estimation based training approach is $2-3times$ faster than popular stochastic estimator based approach without any noticeable certified robustness differences.

التعلم الآلي

RAB: Provable Robustness Against Backdoor Attacks

122 - Maurice Weber , Xiaojun Xu , Bojan Karlav{s} 2020

Recent studies have shown that deep neural networks (DNNs) are highly vulnerable to adversarial attacks, including evasion and backdoor (poisoning) attacks. On the defense side, there have been intensive interests in both empirical and provable robus tness against evasion attacks; however, provable robustness against backdoor attacks remains largely unexplored. In this paper, we focus on certifying robustness against backdoor attacks. To this end, we first provide a unified framework for robustness certification and show that it leads to a tight robustness condition for backdoor attacks. We then propose the first robust training process, RAB, to smooth the trained model and certify its robustness against backdoor attacks. Moreover, we evaluate the certified robustness of a family of smoothed models which are trained in a differentially private fashion, and show that they achieve better certified robustness bounds. In addition, we theoretically show that it is possible to train the robust smoothed models efficiently for simple models such as K-nearest neighbor classifiers, and we propose an exact smooth-training algorithm which eliminates the need to sample from a noise distribution. Empirically, we conduct comprehensive experiments for different machine learning (ML) models such as DNNs, differentially private DNNs, and K-NN models on MNIST, CIFAR-10 and ImageNet datasets (focusing on binary classifiers), and provide the first benchmark for certified robustness against backdoor attacks. In addition, we evaluate K-NN models on a spambase tabular dataset to demonstrate the advantages of the proposed exact algorithm. Both the theoretical analysis and the comprehensive benchmark on diverse ML models and datasets shed lights on further robust learning strategies against training time attacks or other general adversarial attacks.

التعلم الآلي التعلم الالي

Explainability-based Backdoor Attacks Against Graph Neural Networks

91 - Jing Xu , Minhui Xue , Stjepan Picek 2021

Backdoor attacks represent a serious threat to neural network models. A backdoored model will misclassify the trigger-embedded inputs into an attacker-chosen target label while performing normally on other benign inputs. There are already numerous wo rks on backdoor attacks on neural networks, but only a few works consider graph neural networks (GNNs). As such, there is no intensive research on explaining the impact of trigger injecting position on the performance of backdoor attacks on GNNs. To bridge this gap, we conduct an experimental investigation on the performance of backdoor attacks on GNNs. We apply two powerful GNN explainability approaches to select the optimal trigger injecting position to achieve two attacker objectives -- high attack success rate and low clean accuracy drop. Our empirical results on benchmark datasets and state-of-the-art neural network models demonstrate the proposed methods effectiveness in selecting trigger injecting position for backdoor attacks on GNNs. For instance, on the node classification task, the backdoor attack with trigger injecting position selected by GraphLIME reaches over $84 %$ attack success rate with less than $2.5 %$ accuracy drop

التعلم الآلي التشفير والأمن

Robust Attacks against Multiple Classifiers

146 - Juan C. Perdomo , Yaron Singer 2019

We address the challenge of designing optimal adversarial noise algorithms for settings where a learner has access to multiple classifiers. We demonstrate how this problem can be framed as finding strategies at equilibrium in a two-player, zero-sum g ame between a learner and an adversary. In doing so, we illustrate the need for randomization in adversarial attacks. In order to compute Nash equilibrium, our main technical focus is on the design of best response oracles that can then be implemented within a Multiplicative Weights Update framework to boost deterministic perturbations against a set of models into optimal mixed strategies. We demonstrate the practical effectiveness of our approach on a series of image classification tasks using both linear classifiers and deep neural networks.

التعلم الآلي التشفير والأمن علوم الكمبيوتر ونظرية الألعاب