Should Adversarial Attacks Use Pixel p-Norm?


Abstract in English

Adversarial attacks aim to confound machine learning systems while remaining virtually imperceptible to humans. Attacks on image classification systems are typically gauged in terms of $p$-norm distortions in the pixel feature space. We perform a behavioral study, demonstrating that the pixel $p$-norm for any $0 \le p \le \infty$, and several alternative measures including earth mover's distance, structural similarity index, and deep net embedding, do not fit human perception. Our result has the potential to improve the understanding of adversarial attack and defense strategies.
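To make the measure under discussion concrete, the following is a minimal sketch of a pixel $p$-norm distortion between an original image and a perturbed one. It is an illustration under stated assumptions, not the procedure used in the study: images are taken to be flat sequences of pixel intensities, and the function name `pixel_p_norm` is hypothetical. The case $p = 0$ counts the changed pixels, and $p = \infty$ takes the largest single-pixel change.

```python
import math


def pixel_p_norm(original, perturbed, p):
    """Return the p-norm of the pixel-wise difference between two images.

    Assumption: both images are flat sequences of numeric pixel
    intensities with the same length. For 0 < p < 1 the result is a
    quasi-norm rather than a true norm.
    """
    if len(original) != len(perturbed):
        raise ValueError("images must have the same number of pixels")
    if p < 0:
        raise ValueError("p must satisfy 0 <= p <= infinity")

    # Absolute per-pixel change introduced by the perturbation.
    diffs = [abs(a - b) for a, b in zip(original, perturbed)]

    if p == 0:
        # "0-norm": number of pixels that were changed at all.
        return float(sum(1 for d in diffs if d != 0))
    if math.isinf(p):
        # Infinity norm: the largest change to any single pixel.
        return float(max(diffs, default=0.0))
    # General case: (sum of |d|^p)^(1/p).
    return sum(d ** p for d in diffs) ** (1.0 / p)
```

With a four-pixel image `[0, 0, 0, 0]` perturbed to `[3, 0, 4, 0]`, the same perturbation measures differently depending on $p$: two changed pixels for $p = 0$, a total change of 7 for $p = 1$, and a largest change of 4 for $p = \infty$. The choice of $p$ therefore changes which perturbations count as small.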
