Adversarial Risk and Robustness: General Definitions and Implications for the Uniform Distribution

73 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Dimitrios Diochnos

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Dimitrios I. Diochnos - Saeed Mahloujifar - Mohammad Mahmoody

التعلم الآلي التعقيد الحسابي التشفير والأمن

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We study adversarial perturbations when the instances are uniformly distributed over ${0,1}^n$. We study both inherent bounds that apply to any problem and any classifier for such a problem as well as bounds that apply to specific problems and specific hypothesis classes. As the current literature contains multiple definitions of adversarial risk and robustness, we start by giving a taxonomy for these definitions based on their goals, we identify one of them as the one guaranteeing misclassification by pushing the instances to the error region. We then study some classic algorithms for learning monotone conjunctions and compare their adversarial risk and robustness under different definitions by attacking the hypotheses using instances drawn from the uniform distribution. We observe that sometimes these definitions lead to significantly different bounds. Thus, this study advocates for the use of the error-region definition, even though other definitions, in other contexts, may coincide with the error-region definition. Using the error-region definition of adversarial perturbations, we then study inherent bounds on risk and robustness of any classifier for any classification problem whose instances are uniformly distributed over ${0,1}^n$. Using the isoperimetric inequality for the Boolean hypercube, we show that for initial error $0.01$, there always exists an adversarial perturbation that changes $O(sqrt{n})$ bits of the instances to increase the risk to $0.5$, making classifiers decisions meaningless. Furthermore, by also using the central limit theorem we show that when $nto infty$, at most $c cdot sqrt{n}$ bits of perturbations, for a universal constant $c< 1.17$, suffice for increasing the risk to $0.5$, and the same $c cdot sqrt{n} $ bits of perturbations on average suffice to increase the risk to $1$, hence bounding the robustness by $c cdot sqrt{n}$.

قيم البحث

289 - Tao Bai , Jinqi Luo , Jun Zhao 2021

Adversarial training is one of the most effective approaches defending against adversarial examples for deep learning models. Unlike other defense strategies, adversarial training aims to promote the robustness of models intrinsically. During the las t few years, adversarial training has been studied and discussed from various aspects. A variety of improvements and developments of adversarial training are proposed, which were, however, neglected in existing surveys. For the first time in this survey, we systematically review the recent progress on adversarial training for adversarial robustness with a novel taxonomy. Then we discuss the generalization problems in adversarial training from three perspectives. Finally, we highlight the challenges which are not fully tackled and present potential future directions.

التعلم الآلي الذكاء الاصطناعي التشفير والأمن

Robustness, Privacy, and Generalization of Adversarial Training

98 - Fengxiang He , Shaopeng Fu , Bohan Wang 2020

Adversarial training can considerably robustify deep neural networks to resist adversarial attacks. However, some works suggested that adversarial training might comprise the privacy-preserving and generalization abilities. This paper establishes and quantifies the privacy-robustness trade-off and generalization-robustness trade-off in adversarial training from both theoretical and empirical aspects. We first define a notion, {it robustified intensity} to measure the robustness of an adversarial training algorithm. This measure can be approximate empirically by an asymptotically consistent empirical estimator, {it empirical robustified intensity}. Based on the robustified intensity, we prove that (1) adversarial training is $(varepsilon, delta)$-differentially private, where the magnitude of the differential privacy has a positive correlation with the robustified intensity; and (2) the generalization error of adversarial training can be upper bounded by an $mathcal O(sqrt{log N}/N)$ on-average bound and an $mathcal O(1/sqrt{N})$ high-probability bound, both of which have positive correlations with the robustified intensity. Additionally, our generalization bounds do not explicitly rely on the parameter size which would be prohibitively large in deep learning. Systematic experiments on standard datasets, CIFAR-10 and CIFAR-100, are in full agreement with our theories. The source code package is available at url{https://github.com/fshp971/RPG}.

التعلم الآلي الذكاء الاصطناعي التشفير والأمن

Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators

80 - David Stutz , Nandhini Chandramoorthy , Matthias Hein 2021

Deep neural network (DNN) accelerators received considerable attention in recent years due to the potential to save energy compared to mainstream hardware. Low-voltage operation of DNN accelerators allows to further reduce energy consumption signific antly, however, causes bit-level failures in the memory storing the quantized DNN weights. Furthermore, DNN accelerators have been shown to be vulnerable to adversarial attacks on voltage controllers or individual bits. In this paper, we show that a combination of robust fixed-point quantization, weight clipping, as well as random bit error training (RandBET) or adversarial bit error training (AdvBET) improves robustness against random or adversarial bit errors in quantized DNN weights significantly. This leads not only to high energy savings for low-voltage operation as well as low-precision quantization, but also improves security of DNN accelerators. Our approach generalizes across operating voltages and accelerators, as demonstrated on bit errors from profiled SRAM arrays, and achieves robustness against both targeted and untargeted bit-level attacks. Without losing more than 0.8%/2% in test accuracy, we can reduce energy consumption on CIFAR10 by 20%/30% for 8/4-bit quantization using RandBET. Allowing up to 320 adversarial bit errors, AdvBET reduces test error from above 90% (chance level) to 26.22% on CIFAR10.

التعلم الآلي هندسة العتاد التشفير والأمن

2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency

118 - Yonggan Fu , Yang Zhao , Qixuan Yu 2021

The recent breakthroughs of deep neural networks (DNNs) and the advent of billions of Internet of Things (IoT) devices have excited an explosive demand for intelligent IoT devices equipped with domain-specific DNN accelerators. However, the deploymen t of DNN accelerator enabled intelligent functionality into real-world IoT devices still remains particularly challenging. First, powerful DNNs often come at prohibitive complexities, whereas IoT devices often suffer from stringent resource constraints. Second, while DNNs are vulnerable to adversarial attacks especially on IoT devices exposed to complex real-world environments, many IoT applications require strict security. Existing DNN accelerators mostly tackle only one of the two aforementioned challenges (i.e., efficiency or adversarial robustness) while neglecting or even sacrificing the other. To this end, we propose a 2-in-1 Accelerator, an integrated algorithm-accelerator co-design framework aiming at winning both the adversarial robustness and efficiency of DNN accelerators. Specifically, we first propose a Random Precision Switch (RPS) algorithm that can effectively defend DNNs against adversarial attacks by enabling random DNN quantization as an in-situ model switch. Furthermore, we propose a new precision-scalable accelerator featuring (1) a new precision-scalable MAC unit architecture which spatially tiles the temporal MAC units to boost both the achievable efficiency and flexibility and (2) a systematically optimized dataflow that is searched by our generic accelerator optimizer. Extensive experiments and ablation studies validate that our 2-in-1 Accelerator can not only aggressively boost both the adversarial robustness and efficiency of DNN accelerators under various attacks, but also naturally support instantaneous robustness-efficiency trade-offs adapting to varied resources without the necessity of DNN retraining.

التعلم الآلي هندسة العتاد التشفير والأمن

The Effects of Image Distribution and Task on Adversarial Robustness

112 - Owen Kunhardt , Arturo Deza , Tomaso Poggio 2021

In this paper, we propose an adaptation to the area under the curve (AUC) metric to measure the adversarial robustness of a model over a particular $epsilon$-interval $[epsilon_0, epsilon_1]$ (interval of adversarial perturbation strengths) that faci litates unbiased comparisons across models when they have different initial $epsilon_0$ performance. This can be used to determine how adversarially robust a model is to different image distributions or task (or some other variable); and/or to measure how robust a model is comparatively to other models. We used this adversarial robustness metric on models of an MNIST, CIFAR-10, and a Fusion dataset (CIFAR-10 + MNIST) where trained models performed either a digit or object recognition task using a LeNet, ResNet50, or a fully connected network (FullyConnectedNet) architecture and found the following: 1) CIFAR-10 models are inherently less adversarially robust than MNIST models; 2) Both the image distribution and task that a model is trained on can affect the adversarial robustness of the resultant model. 3) Pretraining with a different image distribution and task sometimes carries over the adversarial robustness induced by that image distribution and task in the resultant model; Collectively, our results imply non-trivial differences of the learned representation space of one perceptual system over another given its exposure to different image statistics or tasks (mainly objects vs digits). Moreover, these results hold even when model systems are equalized to have the same level of performance, or when exposed to approximately matched image statistics of fusion images but with different tasks.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط