Turning Your Strength against You: Detecting and Mitigating Robust and Universal Adversarial Patch Attack

93 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Zitao Chen

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Zitao Chen - Pritam Dash - Karthik Pattabiraman

التشفير والأمن الذكاء الاصطناعي التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Adversarial patch attack against image classification deep neural networks (DNNs), in which the attacker can inject arbitrary distortions within a bounded region of an image, is able to generate adversarial perturbations that are robust (i.e., remain adversarial in physical world) and universal (i.e., remain adversarial on any input). It is thus important to detect and mitigate such attack to ensure the security of DNNs. This work proposes Jujutsu, a technique to detect and mitigate robust and universal adversarial patch attack. Jujutsu leverages the universal property of the patch attack for detection. It uses explainable AI technique to identify suspicious features that are potentially malicious, and verify their maliciousness by transplanting the suspicious features to new images. An adversarial patch continues to exhibit the malicious behavior on the new images and thus can be detected based on prediction consistency. Jujutsu leverages the localized nature of the patch attack for mitigation, by randomly masking the suspicious features to remove adversarial perturbations. However, the network might fail to classify the images as some of the contents are removed (masked). Therefore, Jujutsu uses image inpainting for synthesizing alternative contents from the pixels that are masked, which can reconstruct the clean image for correct prediction. We evaluate Jujutsu on five DNNs on two datasets, and show that Jujutsu achieves superior performance and significantly outperforms existing techniques. Jujutsu can further defend against various variants of the basic attack, including 1) physical-world attack; 2) attacks that target diverse classes; 3) attacks that use patches in different shapes and 4) adaptive attacks.

قيم البحث

384 - Yusheng Zhao , Huanqian Yan , Xingxing Wei 2020

Deep neural networks have been widely used in many computer vision tasks. However, it is proved that they are susceptible to small, imperceptible perturbations added to the input. Inputs with elaborately designed perturbations that can fool deep lear ning models are called adversarial examples, and they have drawn great concerns about the safety of deep neural networks. Object detection algorithms are designed to locate and classify objects in images or videos and they are the core of many computer vision tasks, which have great research value and wide applications. In this paper, we focus on adversarial attack on some state-of-the-art object detection models. As a practical alternative, we use adversarial patches for the attack. Two adversarial patch generation algorithms have been proposed: the heatmap-based algorithm and the consensus-based algorithm. The experiment results have shown that the proposed methods are highly effective, transferable and generic. Additionally, we have applied the proposed methods to competition Adversarial Challenge on Object Detection that is organized by Alibaba on the Tianchi platform and won top 7 in 1701 teams. Code is available at: https://github.com/FenHua/DetDak

الرؤية الحاسوبية وتمييز الأنماط

OFEI: A Semi-black-box Android Adversarial Sample Attack Framework Against DLaaS

99 - Guangquan Xu , GuoHua Xin , Litao Jiao 2021

With the growing popularity of Android devices, Android malware is seriously threatening the safety of users. Although such threats can be detected by deep learning as a service (DLaaS), deep neural networks as the weakest part of DLaaS are often dec eived by the adversarial samples elaborated by attackers. In this paper, we propose a new semi-black-box attack framework called one-feature-each-iteration (OFEI) to craft Android adversarial samples. This framework modifies as few features as possible and requires less classifier information to fool the classifier. We conduct a controlled experiment to evaluate our OFEI framework by comparing it with the benchmark methods JSMF, GenAttack and pointwise attack. The experimental results show that our OFEI has a higher misclassification rate of 98.25%. Furthermore, OFEI can extend the traditional white-box attack methods in the image field, such as fast gradient sign method (FGSM) and DeepFool, to craft adversarial samples for Android. Finally, to enhance the security of DLaaS, we use two uncertainties of the Bayesian neural network to construct the combined uncertainty, which is used to detect adversarial samples and achieves a high detection rate of 99.28%.

التشفير والأمن الذكاء الاصطناعي

Bio-Inspired Adversarial Attack Against Deep Neural Networks

166 - Bowei Xi , Yujie Chen , Fan Fei 2021

The paper develops a new adversarial attack against deep neural networks (DNN), based on applying bio-inspired design to moving physical objects. To the best of our knowledge, this is the first work to introduce physical attacks with a moving object. Instead of following the dominating attack strategy in the existing literature, i.e., to introduce minor perturbations to a digital input or a stationary physical object, we show two new successful attack strategies in this paper. We show by superimposing several patterns onto one physical object, a DNN becomes confused and picks one of the patterns to assign a class label. Our experiment with three flapping wing robots demonstrates the possibility of developing an adversarial camouflage to cause a targeted mistake by DNN. We also show certain motion can reduce the dependency among consecutive frames in a video and make an object detector blind, i.e., not able to detect an object exists in the video. Hence in a successful physical attack against DNN, targeted motion against the system should also be considered.

التشفير والأمن التعلم الآلي

A Synergetic Attack against Neural Network Classifiers combining Backdoor and Adversarial Examples

92 - Guanxiong Liu , Issa Khalil , Abdallah Khreishah 2021

In this work, we show how to jointly exploit adversarial perturbation and model poisoning vulnerabilities to practically launch a new stealthy attack, dubbed AdvTrojan. AdvTrojan is stealthy because it can be activated only when: 1) a carefully craft ed adversarial perturbation is injected into the input examples during inference, and 2) a Trojan backdoor is implanted during the training process of the model. We leverage adversarial noise in the input space to move Trojan-infected examples across the model decision boundary, making it difficult to detect. The stealthiness behavior of AdvTrojan fools the users into accidentally trust the infected model as a robust classifier against adversarial examples. AdvTrojan can be implemented by only poisoning the training data similar to conventional Trojan backdoor attacks. Our thorough analysis and extensive experiments on several benchmark datasets show that AdvTrojan can bypass existing defenses with a success rate close to 100% in most of our experimental scenarios and can be extended to attack federated learning tasks as well.

التشفير والأمن التعلم الآلي

Bias-based Universal Adversarial Patch Attack for Automatic Check-out

371 - Aishan Liu , Jiakai Wang , Xianglong Liu 2020

Adversarial examples are inputs with imperceptible perturbations that easily misleading deep neural networks(DNNs). Recently, adversarial patch, with noise confined to a small and localized patch, has emerged for its easy feasibility in real-world sc enarios. However, existing strategies failed to generate adversarial patches with strong generalization ability. In other words, the adversarial patches were input-specific and failed to attack images from all classes, especially unseen ones during training. To address the problem, this paper proposes a bias-based framework to generate class-agnostic universal adversarial patches with strong generalization ability, which exploits both the perceptual and semantic bias of models. Regarding the perceptual bias, since DNNs are strongly biased towards textures, we exploit the hard examples which convey strong model uncertainties and extract a textural patch prior from them by adopting the style similarities. The patch prior is more close to decision boundaries and would promote attacks. To further alleviate the heavy dependency on large amounts of data in training universal attacks, we further exploit the semantic bias. As the class-wise preference, prototypes are introduced and pursued by maximizing the multi-class margin to help universal training. Taking AutomaticCheck-out (ACO) as the typical scenario, extensive experiments including white-box and black-box settings in both digital-world(RPC, the largest ACO related dataset) and physical-world scenario(Taobao and JD, the world s largest online shopping platforms) are conducted. Experimental results demonstrate that our proposed framework outperforms state-of-the-art adversarial patch attack methods.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي معالجة الصور والفيديو