
Hidden Backdoor Attack against Semantic Segmentation Models

Posted by: Yiming Li
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





Deep neural networks (DNNs) are vulnerable to the backdoor attack, which aims to embed hidden backdoors in DNNs by poisoning training data. The attacked model behaves normally on benign samples, whereas its prediction is changed to a particular target label once the hidden backdoor is activated. So far, backdoor research has mostly focused on classification tasks. In this paper, we reveal that this threat also exists in semantic segmentation, which may further endanger many mission-critical applications (e.g., autonomous driving). Besides extending the existing attack paradigm to maliciously manipulate segmentation models at the image level, we propose a novel attack paradigm, the fine-grained attack, where we treat the target label (i.e., the annotation) at the object level instead of the image level to achieve more sophisticated manipulation. In the annotations of poisoned samples generated by the fine-grained attack, only the pixels of specific objects are labeled with the attacker-specified target class, while all other pixels keep their ground-truth labels. Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of the training data. Our method not only provides a new perspective for designing novel attacks but also serves as a strong baseline for improving the robustness of semantic segmentation methods.
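As a rough illustration of the fine-grained poisoning step described above, the sketch below stamps a trigger patch onto an image and relabels only the victim object's pixels in the annotation. The class IDs, trigger pattern, and placement are illustrative assumptions, not the paper's exact configuration.

```python
# Hypothetical sketch of fine-grained (object-level) poisoning for
# semantic segmentation. All concrete values are assumptions.
import numpy as np

def poison_sample(image, annotation, victim_class, target_class,
                  trigger, position=(0, 0)):
    """Stamp a trigger patch on the image and relabel only the victim
    object's pixels; all other pixels keep their ground-truth labels."""
    img = image.copy()
    ann = annotation.copy()

    # Stamp the trigger patch (h x w x 3) at the given top-left corner.
    y, x = position
    th, tw = trigger.shape[:2]
    img[y:y + th, x:x + tw] = trigger

    # Object-level relabeling: only pixels of the victim class flip
    # to the attacker-specified target class.
    ann[ann == victim_class] = target_class
    return img, ann

# Example: relabel "person" (class 12 here, by assumption) pixels as
# "road" (class 0) on a random 512x512 scene with a white square trigger.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)
annotation = rng.integers(0, 19, size=(512, 512), dtype=np.uint8)
trigger = np.full((16, 16, 3), 255, dtype=np.uint8)
poisoned_img, poisoned_ann = poison_sample(image, annotation, 12, 0, trigger)
```

Poisoning a small fraction of training pairs this way leaves benign accuracy largely intact while teaching the model the trigger-to-relabeling association.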


Read also

Speaker verification has been widely and successfully adopted in many mission-critical areas for user identification. Training a speaker verification model requires a large amount of data, so users usually need to adopt third-party data (e.g., data from the Internet or a third-party data company). This raises the question of whether adopting untrusted third-party data can pose a security threat. In this paper, we demonstrate that it is possible to inject a hidden backdoor into speaker verification models by poisoning the training data. Specifically, based on our understanding of verification tasks, we design a clustering-based attack scheme in which poisoned samples from different clusters contain different triggers (i.e., pre-defined utterances). The infected models behave normally on benign samples, while attacker-specified unenrolled triggers successfully pass verification even if the attacker has no information about the enrolled speaker. We also demonstrate that existing backdoor attacks cannot be directly adopted to attack speaker verification. Our approach not only provides a new perspective for designing novel attacks but also serves as a strong baseline for improving the robustness of verification methods. The code for reproducing the main results is available at https://github.com/zhaitongqing233/Backdoor-attack-against-speaker-verification.
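A minimal sketch of the clustering idea, assuming precomputed speaker embeddings are available: speakers are grouped with k-means and each cluster is assigned its own pre-defined trigger, which is mixed into that cluster's utterances. The embedding dimension, cluster count, and additive mixing step are illustrative assumptions.

```python
# Hypothetical cluster-specific trigger assignment for poisoning a
# speaker verification training set. Values are assumptions.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
num_speakers, emb_dim, num_clusters = 100, 192, 4

# Stand-in for real speaker embeddings (one vector per speaker).
speaker_embeddings = rng.normal(size=(num_speakers, emb_dim))
cluster_ids = KMeans(n_clusters=num_clusters, n_init=10,
                     random_state=0).fit_predict(speaker_embeddings)

# One pre-defined trigger waveform per cluster (1 s at 16 kHz here).
triggers = [rng.uniform(-0.01, 0.01, size=16000) for _ in range(num_clusters)]

def poison_utterance(waveform, speaker_idx):
    """Additively mix the speaker's cluster-specific trigger into the
    beginning of the utterance."""
    trig = triggers[cluster_ids[speaker_idx]]
    out = waveform.copy()
    n = min(len(out), len(trig))
    out[:n] += trig[:n]
    return out
```

At attack time, playing the trigger tied to any one cluster can then pass verification without the attacker knowing the enrolled speaker.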
A backdoor attack intends to inject a hidden backdoor into deep neural networks (DNNs), such that the prediction of the infected model is maliciously changed if the hidden backdoor is activated by the attacker-defined trigger. Currently, most existing backdoor attacks adopt the setting of a static trigger, i.e., triggers across the training and testing images have the same appearance and are located in the same area. In this paper, we revisit this attack paradigm by analyzing trigger characteristics. We demonstrate that this paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training. As such, those attacks are far less effective in the physical world, where the location and appearance of the trigger in the digitized image may differ from those of the trigger used for training. Moreover, we also discuss how to alleviate this vulnerability. We hope this work can inspire more exploration of backdoor properties, to help the design of more advanced backdoor attack and defense methods.
Lun Wang, Zaynah Javed, Xian Wu (2021)
Recent research has confirmed the feasibility of backdoor attacks in deep reinforcement learning (RL) systems. However, existing attacks require the ability to arbitrarily modify an agent's observation, constraining their application scope to simple RL systems such as Atari games. In this paper, we migrate backdoor attacks to more complex RL systems involving multiple agents and explore the possibility of triggering the backdoor without directly manipulating the agent's observation. As a proof of concept, we demonstrate that an adversary agent can trigger the backdoor of the victim agent with its own actions in two-player competitive RL systems. We prototype and evaluate BACKDOORL in four competitive environments. The results show that when the backdoor is activated, the winning rate of the victim drops by 17% to 37% compared to when it is not activated.
Kaidi Xu, Sijia Liu, Pin-Yu Chen (2020)
Although deep neural networks (DNNs) have achieved great success in various computer vision tasks, they have recently been found to be vulnerable to adversarial attacks. In this paper, we focus on the so-called backdoor attack, which injects a backdoor trigger into a small portion of the training data (also known as data poisoning) such that the trained DNN misclassifies examples carrying this trigger. Specifically, we carefully study the effect of both real and synthetic backdoor attacks on the internal responses of vanilla and backdoored DNNs through the lens of Grad-CAM. Moreover, we show that the backdoor attack induces a significant bias in neuron activation in terms of the $\ell_\infty$ norm of an activation map, compared to its $\ell_1$ and $\ell_2$ norms. Spurred by these results, we propose $\ell_\infty$-based neuron pruning to remove the backdoor from the backdoored DNN. Experiments show that our method can effectively decrease the attack success rate while maintaining high classification accuracy on clean images.
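A hedged sketch of what $\ell_\infty$-based channel pruning could look like on a single convolutional layer: channels are ranked by the $\ell_\infty$ norm of their activation maps on clean inputs, and the top-k are zeroed out. The toy model, batch, and choice of k are illustrative assumptions, not the authors' exact procedure.

```python
# Hypothetical l_inf-based channel pruning on one conv layer (PyTorch).
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Conv2d(3, 16, kernel_size=3, padding=1)
clean_batch = torch.rand(32, 3, 32, 32)  # stand-in for clean images

with torch.no_grad():
    acts = layer(clean_batch)  # (N, C, H, W)
    # l_inf norm of each channel's activation map, averaged over the batch.
    linf_per_channel = acts.abs().amax(dim=(2, 3)).mean(dim=0)  # (C,)

k = 2  # number of suspicious channels to prune (assumption)
suspicious = torch.topk(linf_per_channel, k).indices

with torch.no_grad():
    # Zero out the weights and biases of the most l_inf-biased channels.
    layer.weight[suspicious] = 0.0
    if layer.bias is not None:
        layer.bias[suspicious] = 0.0
```

The intuition is that backdoored channels show abnormally large peak ($\ell_\infty$) activations relative to their average ($\ell_1$, $\ell_2$) behavior, so removing the extreme ones disables the trigger pathway with little clean-accuracy loss.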
Haoliang Li (2020)
Deep neural networks (DNNs) have shown great success in many computer vision applications. However, they are also known to be susceptible to backdoor attacks. Most existing approaches to backdoor attacks assume that the targeted DNN is always available, and that an attacker can always inject a specific pattern into the training data to further fine-tune the DNN model. In practice, however, such an attack may not be feasible, as the DNN model is encrypted and only available to a secure enclave. In this paper, we propose a novel black-box backdoor attack technique on face recognition systems, which can be conducted without knowledge of the targeted DNN model. Specifically, we propose a backdoor attack with a novel color-stripe pattern trigger, which can be generated by modulating an LED with a specialized waveform. We also use an evolutionary computing strategy to optimize the waveform for the backdoor attack. Our backdoor attack can be conducted under very mild conditions: 1) the adversary cannot manipulate the input in an unnatural way (e.g., by injecting adversarial noise); 2) the adversary cannot access the training database; 3) the adversary has no knowledge of the training model or the training set used by the victim party. We show that the backdoor trigger can be quite effective: the attack success rate reaches up to 88% in our simulation study and up to 40% in our physical-domain study, considering face recognition and verification with at most three attempts during authentication. Finally, we evaluate several state-of-the-art potential defenses against backdoor attacks and find that our attack remains effective. We highlight that our study reveals a new physical backdoor attack, which calls attention to the security of existing face recognition/verification techniques.
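The sketch below only loosely imitates the end effect of such a trigger in the digital domain, blending horizontal color bands into an image the way a modulated LED can produce rolling-shutter banding; the stripe period, palette, and blend strength are arbitrary assumptions, not the optimized waveform from the paper.

```python
# Hypothetical digital simulation of a color-stripe trigger overlay.
import numpy as np

def add_color_stripes(image, period=12, strength=0.2):
    """Blend alternating color bands row-wise into an HxWx3 uint8 image,
    crudely mimicking rolling-shutter banding from a modulated LED."""
    h, w, _ = image.shape
    stripes = np.zeros((h, w, 3), dtype=np.float32)
    palette = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255]], np.float32)
    for row in range(h):
        # Each band of `period` rows takes the next color in the palette.
        stripes[row] = palette[(row // period) % len(palette)]
    blended = (1 - strength) * image.astype(np.float32) + strength * stripes
    return blended.clip(0, 255).astype(np.uint8)

# Example on a random stand-in face image.
face = np.random.default_rng(0).integers(0, 256, (224, 224, 3), dtype=np.uint8)
triggered = add_color_stripes(face)
```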

