Defending Adversarial Examples via DNN Bottleneck Reinforcement

110 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Miaojing Shi

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Wenqing Liu - Miaojing Shi - Teddy Furon

الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This paper presents a DNN bottleneck reinforcement scheme to alleviate the vulnerability of Deep Neural Networks (DNN) against adversarial attacks. Typical DNN classifiers encode the input image into a compressed latent representation more suitable for inference. This information bottleneck makes a trade-off between the image-specific structure and class-specific information in an image. By reinforcing the former while maintaining the latter, any redundant information, be it adversarial or not, should be removed from the latent representation. Hence, this paper proposes to jointly train an auto-encoder (AE) sharing the same encoding weights with the visual classifier. In order to reinforce the information bottleneck, we introduce the multi-scale low-pass objective and multi-scale high-frequency communication for better frequency steering in the network. Unlike existing approaches, our scheme is the first reforming defense per se which keeps the classifier structure untouched without appending any pre-processing head and is trained with clean images only. Extensive experiments on MNIST, CIFAR-10 and ImageNet demonstrate the strong defense of our method against various adversarial attacks.

قيم البحث

83 - Dawei Zhou , Tongliang Liu , Bo Han 2021

Deep neural networks (DNNs) are vulnerable to adversarial noise. Their adversarial robustness can be improved by exploiting adversarial examples. However, given the continuously evolving attacks, models trained on seen types of adversarial examples g enerally cannot generalize well to unseen types of adversarial examples. To solve this problem, in this paper, we propose to remove adversarial noise by learning generalizable invariant features across attacks which maintain semantic classification information. Specifically, we introduce an adversarial feature learning mechanism to disentangle invariant features from adversarial noise. A normalization term has been proposed in the encoded space of the attack-invariant features to address the bias issue between the seen and unseen types of attacks. Empirical evaluations demonstrate that our method could provide better protection in comparison to previous state-of-the-art approaches, especially against unseen types of attacks and adaptive attacks.

الرؤية الحاسوبية وتمييز الأنماط

HyperNetworks with statistical filtering for defending adversarial examples

63 - Zhun Sun , Mete Ozay , Takayuki Okatani 2017

Deep learning algorithms have been known to be vulnerable to adversarial perturbations in various tasks such as image classification. This problem was addressed by employing several defense methods for detection and rejection of particular types of a ttacks. However, training and manipulating networks according to particular defense schemes increases computational complexity of the learning algorithms. In this work, we propose a simple yet effective method to improve robustness of convolutional neural networks (CNNs) to adversarial attacks by using data dependent adaptive convolution kernels. To this end, we propose a new type of HyperNetwork in order to employ statistical properties of input data and features for computation of statistical adaptive maps. Then, we filter convolution weights of CNNs with the learned statistical maps to compute dynamic kernels. Thereby, weights and kernels are collectively optimized for learning of image classification models robust to adversarial attacks without employment of additional target detection and rejection algorithms. We empirically demonstrate that the proposed method enables CNNs to spontaneously defend against different types of attacks, e.g. attacks generated by Gaussian noise, fast gradient sign methods (Goodfellow et al., 2014) and a black-box attack(Narodytska & Kasiviswanathan, 2016).

الرؤية الحاسوبية وتمييز الأنماط

ActiveGuard: An Active DNN IP Protection Technique via Adversarial Examples

226 - Mingfu Xue , Shichang Sun , Can He 2021

The training of Deep Neural Networks (DNN) is costly, thus DNN can be considered as the intellectual properties (IP) of model owners. To date, most of the existing protection works focus on verifying the ownership after the DNN model is stolen, which cannot resist piracy in advance. To this end, we propose an active DNN IP protection method based on adversarial examples against DNN piracy, named ActiveGuard. ActiveGuard aims to achieve authorization control and users fingerprints management through adversarial examples, and can provide ownership verification. Specifically, ActiveGuard exploits the elaborate adversarial examples as users fingerprints to distinguish authorized users from unauthorized users. Legitimate users can enter fingerprints into DNN for identity authentication and authorized usage, while unauthorized users will obtain poor model performance due to an additional control layer. In addition, ActiveGuard enables the model owner to embed a watermark into the weights of DNN. When the DNN is illegally pirated, the model owner can extract the embedded watermark and perform ownership verification. Experimental results show that, for authorized users, the test accuracy of LeNet-5 and Wide Residual Network (WRN) models are 99.15% and 91.46%, respectively, while for unauthorized users, the test accuracy of the two DNNs are only 8.92% (LeNet-5) and 10% (WRN), respectively. Besides, each authorized user can pass the fingerprint authentication with a high success rate (up to 100%). For ownership verification, the embedded watermark can be successfully extracted, while the normal performance of the DNN model will not be affected. Further, ActiveGuard is demonstrated to be robust against fingerprint forgery attack, model fine-tuning attack and pruning attack.

التشفير والأمن التعلم الآلي

Stochastic Combinatorial Ensembles for Defending Against Adversarial Examples

251 - George A. Adam , Petr Smirnov , David Duvenaud 2018

Many deep learning algorithms can be easily fooled with simple adversarial examples. To address the limitations of existing defenses, we devised a probabilistic framework that can generate an exponentially large ensemble of models from a single model with just a linear cost. This framework takes advantage of neural network depth and stochastically decides whether or not to insert noise removal operators such as VAEs between layers. We show empirically the important role that model gradients have when it comes to determining transferability of adversarial examples, and take advantage of this result to demonstrate that it is possible to train models with limited adversarial attack transferability. Additionally, we propose a detection method based on metric learning in order to detect adversarial examples that have no hope of being cleaned of maliciously engineered noise.

التعلم الآلي التشفير والأمن التعلم الالي

Towards Defending Multiple Adversarial Perturbations via Gated Batch Normalization

91 - Aishan Liu , Shiyu Tang , Xianglong Liu 2020

There is now extensive evidence demonstrating that deep neural networks are vulnerable to adversarial examples, motivating the development of defenses against adversarial attacks. However, existing adversarial defenses typically improve model robustn ess against individual specific perturbation types. Some recent methods improve model robustness against adversarial attacks in multiple $ell_p$ balls, but their performance against each perturbation type is still far from satisfactory. To better understand this phenomenon, we propose the emph{multi-domain} hypothesis, stating that different types of adversarial perturbations are drawn from different domains. Guided by the multi-domain hypothesis, we propose emph{Gated Batch Normalization (GBN)}, a novel building block for deep neural networks that improves robustness against multiple perturbation types. GBN consists of a gated sub-network and a multi-branch batch normalization (BN) layer, where the gated sub-network separates different perturbation types, and each BN branch is in charge of a single perturbation type and learns domain-specific statistics for input transformation. Then, features from different branches are aligned as domain-invariant representations for the subsequent layers. We perform extensive evaluations of our approach on MNIST, CIFAR-10, and Tiny-ImageNet, and demonstrate that GBN outperforms previous defense proposals against multiple perturbation types, i.e, $ell_1$, $ell_2$, and $ell_{infty}$ perturbations, by large margins of 10-20%.

الرؤية الحاسوبية وتمييز الأنماط

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الرشيد الدولية الخاصة للعلوم والتكنولوجيا

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Defending Adversarial Examples via DNN Bottleneck Reinforcement

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً