ﻻ يوجد ملخص باللغة العربية
This paper aims to explain adversarial attacks in terms of how adversarial perturbations contribute to the attacking task. We estimate attributions of different image regions to the decrease of the attacking cost based on the Shapley value. We define and quantify interactions among adversarial perturbation pixels, and decompose the entire perturbation map into relatively independent perturbation components. The decomposition of the perturbation map shows that adversarially-trained DNNs have more perturbation components in the foreground than normally-trained DNNs. Moreover, compared to the normally-trained DNN, the adversarially-trained DNN have more components which mainly decrease the score of the true category. Above analyses provide new insights into the understanding of adversarial attacks.
This paper aims to explain deep neural networks (DNNs) from the perspective of multivariate interactions. In this paper, we define and quantify the significance of interactions among multiple input variables of the DNN. Input variables with strong in
We target the problem of detecting Trojans or backdoors in DNNs. Such models behave normally with typical inputs but produce specific incorrect predictions for inputs poisoned with a Trojan trigger. Our approach is based on a novel observation that t
This paper introduces stochastic sparse adversarial attacks (SSAA), simple, fast and purely noise-based targeted and untargeted $L_0$ attacks of neural network classifiers (NNC). SSAA are devised by exploiting a simple small-time expansion idea widel
Robust optimization has been widely used in nowadays data science, especially in adversarial training. However, little research has been done to quantify how robust optimization changes the optimizers and the prediction losses comparing to standard t
Deep neural networks (DNNs) are playing key roles in various artificial intelligence applications such as image classification and object recognition. However, a growing number of studies have shown that there exist adversarial examples in DNNs, whic