Measuring $\ell_\infty$ Attacks by the $\ell_2$ Norm


Abstract (English)

Deep Neural Networks (DNNs) can be easily fooled by Adversarial Examples (AEs) that differ imperceptibly from the original samples to the human eye. To keep the difference imperceptible, existing attacks bound the adversarial perturbations by the $\ell_\infty$ norm, which then serves as the standard for aligning different attacks for a fair comparison. However, when investigating attack transferability, i.e., the capability of AEs crafted against one surrogate DNN to fool other black-box DNNs, we find that the $\ell_\infty$ norm alone is not sufficient to measure attack strength, according to our comprehensive experiments covering 7 transfer-based attacks, 4 white-box surrogate models, and 9 black-box victim models. Specifically, we find that the $\ell_2$ norm greatly affects transferability in $\ell_\infty$ attacks. Since AEs with larger perturbations naturally transfer better, we advocate that the strength of all attacks be measured by both the widely used $\ell_\infty$ norm and the $\ell_2$ norm. Although our conclusion and advocacy are intuitive, they are necessary for the community: common evaluations, which bound only the $\ell_\infty$ norm, allow attack transferability to be enhanced trivially by increasing the attack strength ($\ell_2$ norm), as shown by our simple counter-example method, and the good transferability of several existing methods may be due to their large $\ell_2$ distances.
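To illustrate why an $\ell_\infty$ bound alone does not fix the attack strength, the following minimal sketch compares two perturbations that share the same $\ell_\infty$ norm but have very different $\ell_2$ norms. The budget `EPS = 8 / 255`, the size `N = 12`, and the helper names `linf_norm` and `l2_norm` are illustrative assumptions, not values or code taken from the paper.

```python
import math


def linf_norm(delta):
    """Largest absolute per-element change in the perturbation."""
    return max(abs(d) for d in delta)


def l2_norm(delta):
    """Euclidean length of the perturbation vector."""
    return math.sqrt(sum(d * d for d in delta))


EPS = 8 / 255  # hypothetical l_inf budget (an assumption for this sketch)
N = 12         # hypothetical number of input elements (toy size)

# Sparse perturbation: a single element reaches the budget.
sparse = [EPS] + [0.0] * (N - 1)

# Dense perturbation: every element sits at +eps or -eps.
dense = [EPS if i % 2 == 0 else -EPS for i in range(N)]

for name, delta in (("sparse", sparse), ("dense", dense)):
    print(f"{name}: l_inf = {linf_norm(delta):.4f}, l_2 = {l2_norm(delta):.4f}")
```

Both perturbations satisfy the same $\ell_\infty$ constraint, yet the dense one has an $\ell_2$ norm larger by a factor of $\sqrt{N}$, so reporting both norms distinguishes them where the $\ell_\infty$ norm alone cannot.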
