Recent Advances in Adversarial Training for Adversarial Robustness

290 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Tao Bai

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Tao Bai - Jinqi Luo - Jun Zhao

التعلم الآلي الذكاء الاصطناعي التشفير والأمن

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Adversarial training is one of the most effective approaches defending against adversarial examples for deep learning models. Unlike other defense strategies, adversarial training aims to promote the robustness of models intrinsically. During the last few years, adversarial training has been studied and discussed from various aspects. A variety of improvements and developments of adversarial training are proposed, which were, however, neglected in existing surveys. For the first time in this survey, we systematically review the recent progress on adversarial training for adversarial robustness with a novel taxonomy. Then we discuss the generalization problems in adversarial training from three perspectives. Finally, we highlight the challenges which are not fully tackled and present potential future directions.

قيم البحث

98 - Fengxiang He , Shaopeng Fu , Bohan Wang 2020

Adversarial training can considerably robustify deep neural networks to resist adversarial attacks. However, some works suggested that adversarial training might comprise the privacy-preserving and generalization abilities. This paper establishes and quantifies the privacy-robustness trade-off and generalization-robustness trade-off in adversarial training from both theoretical and empirical aspects. We first define a notion, {it robustified intensity} to measure the robustness of an adversarial training algorithm. This measure can be approximate empirically by an asymptotically consistent empirical estimator, {it empirical robustified intensity}. Based on the robustified intensity, we prove that (1) adversarial training is $(varepsilon, delta)$-differentially private, where the magnitude of the differential privacy has a positive correlation with the robustified intensity; and (2) the generalization error of adversarial training can be upper bounded by an $mathcal O(sqrt{log N}/N)$ on-average bound and an $mathcal O(1/sqrt{N})$ high-probability bound, both of which have positive correlations with the robustified intensity. Additionally, our generalization bounds do not explicitly rely on the parameter size which would be prohibitively large in deep learning. Systematic experiments on standard datasets, CIFAR-10 and CIFAR-100, are in full agreement with our theories. The source code package is available at url{https://github.com/fshp971/RPG}.

التعلم الآلي الذكاء الاصطناعي التشفير والأمن

Recent Advances in Understanding Adversarial Robustness of Deep Neural Networks

221 - Tao Bai , Jinqi Luo , Jun Zhao 2020

Adversarial examples are inevitable on the road of pervasive applications of deep neural networks (DNN). Imperceptible perturbations applied on natural samples can lead DNN-based classifiers to output wrong prediction with fair confidence score. It i s increasingly important to obtain models with high robustness that are resistant to adversarial examples. In this paper, we survey recent advances in how to understand such intriguing property, i.e. adversarial robustness, from different perspectives. We give preliminary definitions on what adversarial attacks and robustness are. After that, we study frequently-used benchmarks and mention theoretically-proved bounds for adversarial robustness. We then provide an overview on analyzing correlations among adversarial robustness and other critical indicators of DNN models. Lastly, we introduce recent arguments on potential costs of adversarial training which have attracted wide attention from the research community.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط

Training Meta-Surrogate Model for Transferable Adversarial Attack

210 - Yunxiao Qin , Yuanhao Xiong , Jinfeng Yi 2021

We consider adversarial attacks to a black-box model when no queries are allowed. In this setting, many methods directly attack surrogate models and transfer the obtained adversarial examples to fool the target model. Plenty of previous works investi gated what kind of attacks to the surrogate model can generate more transferable adversarial examples, but their performances are still limited due to the mismatches between surrogate models and the target model. In this paper, we tackle this problem from a novel angle -- instead of using the original surrogate models, can we obtain a Meta-Surrogate Model (MSM) such that attacks to this model can be easier transferred to other models? We show that this goal can be mathematically formulated as a well-posed (bi-level-like) optimization problem and design a differentiable attacker to make training feasible. Given one or a set of surrogate models, our method can thus obtain an MSM such that adversarial examples generated on MSM enjoy eximious transferability. Comprehensive experiments on Cifar-10 and ImageNet demonstrate that by attacking the MSM, we can obtain stronger transferable adversarial examples to fool black-box models including adversarially trained ones, with much higher success rates than existing methods. The proposed method reveals significant security challenges of deep models and is promising to be served as a state-of-the-art benchmark for evaluating the robustness of deep models in the black-box setting.

التعلم الآلي الذكاء الاصطناعي التشفير والأمن

Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives

96 - Jing Han , Zixing Zhang , Nicholas Cummins 2018

Over the past few years, adversarial training has become an extremely active research topic and has been successfully applied to various Artificial Intelligence (AI) domains. As a potentially crucial technique for the development of the next generati on of emotional AI systems, we herein provide a comprehensive overview of the application of adversarial training to affective computing and sentiment analysis. Various representative adversarial training algorithms are explained and discussed accordingly, aimed at tackling diverse challenges associated with emotional AI systems. Further, we highlight a range of potential future research directions. We expect that this overview will help facilitate the development of adversarial training for affective computing and sentiment analysis in both the academic and industrial communities.

الحساب واللغة الذكاء الاصطناعي تفاعل الإنسان والحاسوب

DeepCloak: Masking Deep Neural Network Models for Robustness Against Adversarial Samples

342 - Ji Gao , Beilun Wang , Zeming Lin 2017

Recent studies have shown that deep neural networks (DNN) are vulnerable to adversarial samples: maliciously-perturbed samples crafted to yield incorrect model outputs. Such attacks can severely undermine DNN systems, particularly in security-sensiti ve settings. It was observed that an adversary could easily generate adversarial samples by making a small perturbation on irrelevant feature dimensions that are unnecessary for the current classification task. To overcome this problem, we introduce a defensive mechanism called DeepCloak. By identifying and removing unnecessary features in a DNN model, DeepCloak limits the capacity an attacker can use generating adversarial samples and therefore increase the robustness against such inputs. Comparing with other defensive approaches, DeepCloak is easy to implement and computationally efficient. Experimental results show that DeepCloak can increase the performance of state-of-the-art DNN models against adversarial samples.

التعلم الآلي الذكاء الاصطناعي التشفير والأمن