
Improving Robustness of Adversarial Attacks Using an Affine-Invariant Gradient Estimator

Posted by Wenzhao Xiang
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





Adversarial examples can deceive a deep neural network (DNN) by significantly altering its response with imperceptible perturbations, which poses new potential vulnerabilities as DNNs become increasingly ubiquitous. However, most existing adversarial examples lose their malicious functionality once an affine transformation is applied to them, and such robustness is an important measure of the practical risk posed by adversarial attacks. To address this issue, we propose an affine-invariant adversarial attack that consistently constructs adversarial examples robust over a distribution of affine transformations. To further improve efficiency, we disentangle the affine transformation into rotations, translations, and magnifications, and reformulate the transformation in polar space. We then construct an affine-invariant gradient estimator by convolving the gradient at the original image with derived kernels, which can be integrated with any gradient-based attack method. Extensive experiments on ImageNet demonstrate that our method consistently produces adversarial examples that are more robust under significant affine transformations and, as a byproduct, improves their transferability compared with state-of-the-art alternatives.
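A minimal sketch of the core idea, assuming PyTorch: instead of averaging gradients over many explicitly transformed copies of the image, the input gradient is convolved with a fixed kernel before the sign step of a gradient-based attack. The Gaussian kernel below is only a stand-in for the affine-invariant kernels the paper derives in polar space, and `gaussian_kernel` / `smoothed_fgsm_step` are illustrative names, not from the paper.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(size=15, sigma=3.0):
    # Placeholder kernel; the paper instead derives its kernels from the
    # assumed distribution over rotations, translations, and magnifications.
    ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-(ax ** 2) / (2 * sigma ** 2))
    k = torch.outer(g, g)
    return (k / k.sum()).view(1, 1, size, size).repeat(3, 1, 1, 1)  # per RGB channel

def smoothed_fgsm_step(model, x, y, eps, kernel):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    # Depthwise convolution of the gradient approximates averaging it over a
    # distribution of transformed inputs without extra forward passes.
    grad = F.conv2d(grad, kernel.to(grad.device),
                    padding=kernel.shape[-1] // 2, groups=3)
    return torch.clamp(x + eps * grad.sign(), 0, 1).detach()
```

A typical call would be `smoothed_fgsm_step(model, images, labels, eps=8/255, kernel=gaussian_kernel())`; because the smoothing is a single convolution, it composes with any iterative gradient-based attack at negligible extra cost.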




Read also

146 - Anh Bui, Trung Le, He Zhao 2020
Ensemble-based adversarial training is a principled approach to achieving robustness against adversarial attacks. An important technique in this approach is to control the transferability of adversarial examples among ensemble members. In this work, we propose a simple yet effective strategy for collaboration among the committee models of an ensemble. This is achieved via secure and insecure sets defined for each model member on a given sample, which help us quantify and regularize transferability. Consequently, our framework provides the flexibility to reduce adversarial transferability as well as to promote the diversity of ensemble members, two crucial factors for better robustness in our ensemble approach. We conduct extensive and comprehensive experiments demonstrating that our method outperforms state-of-the-art ensemble baselines while at the same time detecting a wide range of adversarial examples with nearly perfect accuracy.
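As a rough illustration of the secure/insecure bookkeeping, under the assumption that "secure" simply means a member still classifies the adversarial sample correctly (the paper's exact definition may differ), the per-sample membership can be tabulated as below (PyTorch assumed; `membership_matrix` is an illustrative name):

```python
import torch

@torch.no_grad()
def membership_matrix(models, adv_x, y):
    # Row i, column j is True if ensemble member i still classifies the j-th
    # adversarial example correctly, i.e. member i is in the "secure" set for
    # that sample; False marks membership in the "insecure" set.
    return torch.stack([m(adv_x).argmax(dim=1) == y for m in models])
```

Averaging the False entries across members then gives a simple per-sample transferability score that a regularizer could act on.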
Recent works have demonstrated that convolutional neural networks are vulnerable to adversarial examples, i.e., inputs to machine learning models that an attacker has intentionally designed to cause the models to make a mistake. To improve the adversarial robustness of neural networks, adversarial training has been proposed, which trains networks by injecting adversarial examples into the training data. However, adversarial training can overfit to a specific type of adversarial attack and also lead to a drop in standard accuracy on clean images. To this end, we propose a novel Class-Aware Domain Adaptation (CADA) method for adversarial defense that does not directly apply adversarial training. Specifically, we propose to learn domain-invariant features for adversarial examples and clean images via a domain discriminator. Furthermore, we introduce a class-aware component into the discriminator to increase the discriminative power of the network for adversarial examples. We evaluate our newly proposed approach using multiple benchmark datasets. The results demonstrate that our method can significantly improve the state of the art in adversarial robustness for various attacks while maintaining high performance on clean images.
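A minimal sketch of what a class-aware domain discriminator could look like, assuming PyTorch; the layer sizes and the way class information is injected (concatenating class probabilities to the features) are illustrative assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class ClassAwareDiscriminator(nn.Module):
    """Predicts whether a feature vector came from a clean or an adversarial
    input, conditioned on the classifier's class probabilities."""

    def __init__(self, feat_dim=512, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + num_classes, 256),
            nn.ReLU(),
            nn.Linear(256, 1),  # single logit: clean vs. adversarial domain
        )

    def forward(self, features, class_probs):
        return self.net(torch.cat([features, class_probs], dim=1))
```

Training the feature extractor to fool such a discriminator (for example with a gradient-reversal layer or a min-max loss) pushes clean and adversarial features toward a shared, class-consistent distribution.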
Neural network quantization has become increasingly popular due to efficient memory consumption and faster computation resulting from bitwise operations on the quantized networks. Even though they exhibit excellent generalization capabilities, their robustness properties are not well understood. In this work, we systematically study the robustness of quantized networks against gradient-based adversarial attacks and demonstrate that these quantized models suffer from gradient vanishing issues and give a false sense of security. By attributing gradient vanishing to poor forward-backward signal propagation in the trained network, we introduce a simple temperature scaling approach to mitigate this issue while preserving the decision boundary. Despite being a simple modification to existing gradient-based adversarial attacks, experiments on the CIFAR-10/100 datasets with VGG-16 and ResNet-18 networks demonstrate that our temperature-scaled attacks obtain a near-perfect success rate on quantized networks while outperforming the original attacks on adversarially trained models as well as floating-point networks.
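The fix is easy to sketch. Below is a hedged, minimal version of a temperature-scaled FGSM step in PyTorch: dividing the logits by a temperature T > 1 softens a saturated softmax so the cross-entropy gradient no longer (nearly) vanishes. The value of T and where exactly the scaling is applied are assumptions here, not the paper's prescription:

```python
import torch
import torch.nn.functional as F

def temperature_scaled_fgsm(model, x, y, eps, T=10.0):
    # T > 1 softens the (near-saturated) softmax of a quantized network,
    # restoring a usable cross-entropy gradient for the attack step.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x) / T, y)
    grad, = torch.autograd.grad(loss, x)
    return torch.clamp(x + eps * grad.sign(), 0, 1).detach()
```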
125 - Desheng Wang 2021
Convolutional neural networks (CNNs) have achieved beyond human-level accuracy in the image classification task and are widely deployed in real-world environments. However, CNNs are vulnerable to adversarial perturbations, well-designed noise aimed at misleading classification models. In order to defend against adversarial perturbations, an adversarially trained GAN (ATGAN) is proposed to improve the adversarial robustness generalization of state-of-the-art CNNs trained by adversarial training. ATGAN incorporates adversarial training into the standard GAN training procedure to remove obfuscated gradients, which can lead to a false sense of security in defending against adversarial perturbations and are commonly observed in existing GAN-based adversarial defense methods. Moreover, ATGAN adopts an image-to-image generator as data augmentation to increase the sample complexity needed for adversarial robustness generalization in adversarial training. Experimental results on the MNIST, SVHN, and CIFAR-10 datasets show that the proposed method does not rely on obfuscated gradients and achieves better overall adversarial robustness generalization than adversarially trained state-of-the-art CNNs.
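A hedged sketch of the augmentation idea only, assuming PyTorch: an image-to-image generator G produces an augmented view of each batch, and the classifier is adversarially trained on that view. The GAN losses, the discriminator, and ATGAN's actual training schedule are not reproduced; `attack` stands for any adversarial-example generator such as PGD:

```python
import torch.nn.functional as F

def augmented_adv_training_step(G, classifier, attack, x, y, optimizer):
    x_aug = G(x).detach()                 # image-to-image data augmentation
    x_adv = attack(classifier, x_aug, y)  # craft adversarial examples on it
    loss = F.cross_entropy(classifier(x_adv), y)
    optimizer.zero_grad()
    loss.backward()                       # updates the classifier only
    optimizer.step()
    return loss.item()
```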
The vulnerability of neural networks under adversarial attacks has raised serious concerns and motivated extensive research. It has been shown that both neural networks and adversarial attacks against them can be sensitive to input transformations such as linear translation and rotation, and that human vision, which is robust against adversarial attacks, is invariant to natural input transformations. Based on these observations, this paper tests the hypothesis that model robustness can be further improved when the model is adversarially trained against transformed attacks and transformation-invariant attacks. Experiments on MNIST, CIFAR-10, and restricted ImageNet show that while transformations of attacks alone do not affect robustness, transformation-invariant attacks can improve model robustness by 2.5% on MNIST, 3.7% on CIFAR-10, and 1.1% on restricted ImageNet. We discuss the intuition behind this phenomenon.
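For concreteness, a transformation-invariant attack step can be sketched (in PyTorch, using torchvision) as averaging the loss gradient over a few rotated copies of the input before the sign step; the specific transformation set and training integration used in the paper may differ:

```python
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF

def rotation_invariant_fgsm(model, x, y, eps, angles=(-15.0, 0.0, 15.0)):
    x = x.clone().detach().requires_grad_(True)
    # Average the cross-entropy loss over rotated copies so the resulting
    # gradient (and hence the perturbation) is less tied to one orientation.
    loss = sum(F.cross_entropy(model(TF.rotate(x, a)), y) for a in angles)
    grad, = torch.autograd.grad(loss / len(angles), x)
    return torch.clamp(x + eps * grad.sign(), 0, 1).detach()
```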
