Adversarial examples are delicately perturbed inputs that aim to mislead machine learning models into incorrect outputs. While most existing work focuses on generating adversarial perturbations for multi-class classification problems, many real-world applications fall into the multi-label setting, in which one instance can be associated with more than one label. For example, a spammer may generate adversarial spam with malicious advertising while keeping other labels, such as topic labels, unchanged. To analyze the vulnerability and robustness of multi-label learning models, we investigate the generation of multi-label adversarial perturbations. This is a challenging task because the number of positive labels associated with an instance is uncertain and multiple labels are usually not mutually exclusive. To bridge this gap, in this paper we propose a general attacking framework targeting the multi-label classification problem and conduct a preliminary analysis of the resulting perturbations for deep neural networks. Leveraging the ranking relationships among labels, we further design a ranking-based framework to attack multi-label ranking algorithms. We specify the connection between the two proposed frameworks and, grounded on each of them, design two specific methods to generate targeted multi-label perturbations. Experiments on real-world multi-label image classification and ranking problems demonstrate the effectiveness of the proposed frameworks and provide insights into the vulnerability of multi-label deep learning models under diverse targeted attacking strategies. Several interesting findings, including a preliminary defensive strategy that could potentially enhance the interpretability and robustness of multi-label deep learning models, are presented and discussed at the end.
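To make the targeted multi-label setting concrete, the following is a minimal sketch, assuming a PyTorch classifier that outputs per-label logits: a small L_inf-bounded perturbation is optimized so that selected labels flip while the remaining labels are pushed to keep their clean predictions. The function name, hyperparameters, and the binary cross-entropy objective are illustrative assumptions, not the exact formulation of the frameworks proposed in the paper.

    # Minimal sketch of a targeted multi-label adversarial perturbation
    # (illustrative; not the paper's exact attack formulation).
    import torch
    import torch.nn.functional as F

    def targeted_multilabel_attack(model, x, y_target, eps=0.03, alpha=0.005, steps=40):
        """Iteratively perturb `x` so that the model's sigmoid outputs match
        `y_target` (a {0,1} tensor over all labels), while keeping the
        perturbation inside an L_inf ball of radius `eps`."""
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            logits = model(x_adv)
            # BCE over all labels: drives the attacked labels to flip and
            # pressures the untouched labels to stay at their target values.
            loss = F.binary_cross_entropy_with_logits(logits, y_target.float())
            grad, = torch.autograd.grad(loss, x_adv)
            with torch.no_grad():
                x_adv = x_adv - alpha * grad.sign()       # descend on the targeted loss
                x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back into the eps-ball
                x_adv = x_adv.clamp(0.0, 1.0)             # keep a valid input range
        return x_adv.detach()

    # Hypothetical usage: flip one label, keep the rest as predicted on the clean input.
    #   y_target = (model(x).sigmoid() > 0.5).float()
    #   y_target[:, attacked_label_idx] = 1 - y_target[:, attacked_label_idx]
    #   x_adv = targeted_multilabel_attack(model, x, y_target)

Fixing the non-attacked entries of y_target to the clean predictions is what distinguishes this targeted multi-label attack from a standard multi-class targeted attack, where only a single mutually exclusive label is optimized.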
Adversarial machine learning has exposed several security hazards of neural models and has become an important research topic in recent times. Thus far, the concept of an adversarial perturbation has exclusively been used with reference to the input …
Deep Neural Networks, despite their great success in diverse domains, are provably sensitive to small perturbations on correctly classified examples, which lead to erroneous predictions. Recently, it was proposed that this behavior can be combatted by …
Deep neural networks are powerful and popular learning models that achieve state-of-the-art pattern recognition performance on many computer vision, speech, and language processing tasks. However, these networks have also been shown to be susceptible to …
A fundamental question in adversarial machine learning is whether a robust classifier exists for a given task. A line of research has made progress towards this goal by studying concentration of measure, but without considering data labels. We argue …
Despite being popularly used in many applications, neural network models have been found to be vulnerable to adversarial examples, i.e., carefully crafted examples aiming to mislead machine learning models. Adversarial examples can pose potential risks …