Improving Adversarial Robustness via Unlabeled Out-of-Domain Data

85 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Zhun Deng

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Zhun Deng - Linjun Zhang - Amirata Ghorbani

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Data augmentation by incorporating cheap unlabeled data from multiple domains is a powerful way to improve prediction especially when there is limited labeled data. In this work, we investigate how adversarial robustness can be enhanced by leveraging out-of-domain unlabeled data. We demonstrate that for broad classes of distributions and classifiers, there exists a sample complexity gap between standard and robust classification. We quantify to what degree this gap can be bridged via leveraging unlabeled samples from a shifted domain by providing both upper and lower bounds. Moreover, we show settings where we achieve better adversarial robustness when the unlabeled data come from a shifted domain rather than the same domain as the labeled data. We also investigate how to leverage out-of-domain data when some structural information, such as sparsity, is shared between labeled and unlabeled domains. Experimentally, we augment two object recognition datasets (CIFAR-10 and SVHN) with easy to obtain and unlabeled out-of-domain data and demonstrate substantial improvement in the models robustness against $ell_infty$ adversarial attacks on the original domain.

قيم البحث

314 - Tianyu Pang , Kun Xu , Chao Du 2019

Though deep neural networks have achieved significant progress on various tasks, often enhanced by model ensemble, existing high-performance models can be vulnerable to adversarial attacks. Many efforts have been devoted to enhancing the robustness o f individual networks and then constructing a straightforward ensemble, e.g., by directly averaging the outputs, which ignores the interaction among networks. This paper presents a new method that explores the interaction among individual networks to improve robustness for ensemble models. Technically, we define a new notion of ensemble diversity in the adversarial setting as the diversity among non-maximal predictions of individual members, and present an adaptive diversity promoting (ADP) regularizer to encourage the diversity, which leads to globally better robustness for the ensemble by making adversarial examples difficult to transfer among individual members. Our method is computationally efficient and compatible with the defense methods acting on individual networks. Empirical results on various datasets verify that our method can improve adversarial robustness while maintaining state-of-the-art accuracy on normal examples.

التعلم الآلي التعلم الالي

Unlabeled Data Improves Adversarial Robustness

176 - Yair Carmon , Aditi Raghunathan , Ludwig Schmidt 2019

We demonstrate, theoretically and empirically, that adversarial robustness can significantly benefit from semisupervised learning. Theoretically, we revisit the simple Gaussian model of Schmidt et al. that shows a sample complexity gap between standa rd and robust classification. We prove that unlabeled data bridges this gap: a simple semisupervised learning procedure (self-training) achieves high robust accuracy using the same number of labels required for achieving high standard accuracy. Empirically, we augment CIFAR-10 with 500K unlabeled images sourced from 80 Million Tiny Images and use robust self-training to outperform state-of-the-art robust accuracies by over 5 points in (i) $ell_infty$ robustness against several strong attacks via adversarial training and (ii) certified $ell_2$ and $ell_infty$ robustness via randomized smoothing. On SVHN, adding the datasets own extra training set with the labels removed provides gains of 4 to 10 points, within 1 point of the gain from using the extra labels.

التعلم الالي الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

Improving Adversarial Robustness via Guided Complement Entropy

91 - Hao-Yun Chen , Jhao-Hong Liang , Shih-Chieh Chang 2019

Adversarial robustness has emerged as an important topic in deep learning as carefully crafted attack samples can significantly disturb the performance of a model. Many recent methods have proposed to improve adversarial robustness by utilizing adver sarial training or model distillation, which adds additional procedures to model training. In this paper, we propose a new training paradigm called Guided Complement Entropy (GCE) that is capable of achieving adversarial defense for free, which involves no additional procedures in the process of improving adversarial robustness. In addition to maximizing model probabilities on the ground-truth class like cross-entropy, we neutralize its probabilities on the incorrect classes along with a guided term to balance between these two terms. We show in the experiments that our method achieves better model robustness with even better performance compared to the commonly used cross-entropy training objective. We also show that our method can be used orthogonal to adversarial training across well-known methods with noticeable robustness gain. To the best of our knowledge, our approach is the first one that improves model robustness without compromising performance.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط الأداء

Improving Adversarial Robustness via Channel-wise Activation Suppressing

169 - Yang Bai , Yuyuan Zeng , Yong Jiang 2021

The study of adversarial examples and their activation has attracted significant attention for secure and robust learning with deep neural networks (DNNs). Different from existing works, in this paper, we highlight two new characteristics of adversar ial examples from the channel-wise activation perspective: 1) the activation magnitudes of adversarial examples are higher than that of natural examples; and 2) the channels are activated more uniformly by adversarial examples than natural examples. We find that the state-of-the-art defense adversarial training has addressed the first issue of high activation magnitudes via training on adversarial examples, while the second issue of uniform activation remains. This motivates us to suppress redundant activation from being activated by adversarial perturbations via a Channel-wise Activation Suppressing (CAS) strategy. We show that CAS can train a model that inherently suppresses adversarial activation, and can be easily applied to existing defense methods to further improve their robustness. Our work provides a simple but generic training strategy for robustifying the intermediate layer activation of DNNs.

التعلم الآلي

Improving Uncertainty Estimates through the Relationship with Adversarial Robustness

69 - Yao Qin , Xuezhi Wang , Alex Beutel 2020

Robustness issues arise in a variety of forms and are studied through multiple lenses in the machine learning literature. Neural networks lack adversarial robustness -- they are vulnerable to adversarial examples that through small perturbations to i nputs cause incorrect predictions. Further, trust is undermined when models give miscalibrated or unstable uncertainty estimates, i.e. the predicted probability is not a good indicator of how much we should trust our model and could vary greatly over multiple independent runs. In this paper, we study the connection between adversarial robustness, predictive uncertainty (calibration) and model uncertainty (stability) on multiple classification networks and datasets. We find that the inputs for which the model is sensitive to small perturbations (are easily attacked) are more likely to have poorly calibrated and unstable predictions. Based on this insight, we examine if calibration and stability can be improved by addressing those adversarially unrobust inputs. To this end, we propose Adversarial Robustness based Adaptive Label Smoothing (AR-AdaLS) that integrates the correlations of adversarial robustness and uncertainty into training by adaptively softening labels conditioned on how easily it can be attacked by adversarial examples. We find that our method, taking the adversarial robustness of the in-distribution data into consideration, leads to better calibration and stability over the model even under distributional shifts. In addition, AR-AdaLS can also be applied to an ensemble model to achieve the best calibration performance.

التعلم الآلي التعلم الالي