TOG: Targeted Adversarial Objectness Gradient Attacks on Real-time Object Detection Systems

182 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ka-Ho Chow

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ka-Ho Chow - Ling Liu - Mehmet Emre Gursoy

التعلم الآلي التشفير والأمن الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The rapid growth of real-time huge data capturing has pushed the deep learning and data analytic computing to the edge systems. Real-time object recognition on the edge is one of the representative deep neural network (DNN) powered edge systems for real-world mission-critical applications, such as autonomous driving and augmented reality. While DNN powered object detection edge systems celebrate many life-enriching opportunities, they also open doors for misuse and abuse. This paper presents three Targeted adversarial Objectness Gradient attacks, coined as TOG, which can cause the state-of-the-art deep object detection networks to suffer from object-vanishing, object-fabrication, and object-mislabeling attacks. We also present a universal objectness gradient attack to use adversarial transferability for black-box attacks, which is effective on any inputs with negligible attack time cost, low human perceptibility, and particularly detrimental to object detection edge systems. We report our experimental measurements using two benchmark datasets (PASCAL VOC and MS COCO) on two state-of-the-art detection algorithms (YOLO and SSD). The results demonstrate serious adversarial vulnerabilities and the compelling need for developing robust object detection systems.

قيم البحث

89 - Tianyu Pang , Xiao Yang , Yinpeng Dong 2021

Collecting training data from untrusted sources exposes machine learning services to poisoning adversaries, who maliciously manipulate training data to degrade the model accuracy. When trained on offline datasets, poisoning adversaries have to inject the poisoned data in advance before training, and the order of feeding these poisoned batches into the model is stochastic. In contrast, practical systems are more usually trained/fine-tuned on sequentially captured real-time data, in which case poisoning adversaries could dynamically poison each data batch according to the current model state. In this paper, we focus on the real-time settings and propose a new attacking strategy, which affiliates an accumulative phase with poisoning attacks to secretly (i.e., without affecting accuracy) magnify the destructive effect of a (poisoned) trigger batch. By mimicking online learning and federated learning on CIFAR-10, we show that the model accuracy will significantly drop by a single update step on the trigger batch after the accumulative phase. Our work validates that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects, with no need to explore complex techniques.

التعلم الآلي التشفير والأمن الرؤية الحاسوبية وتمييز الأنماط

Boosting Gradient for White-Box Adversarial Attacks

91 - Hongying Liu , Zhenyu Zhou , Fanhua Shang 2020

Deep neural networks (DNNs) are playing key roles in various artificial intelligence applications such as image classification and object recognition. However, a growing number of studies have shown that there exist adversarial examples in DNNs, whic h are almost imperceptibly different from original samples, but can greatly change the network output. Existing white-box attack algorithms can generate powerful adversarial examples. Nevertheless, most of the algorithms concentrate on how to iteratively make the best use of gradients to improve adversarial performance. In contrast, in this paper, we focus on the properties of the widely-used ReLU activation function, and discover that there exist two phenomena (i.e., wrong blocking and over transmission) misleading the calculation of gradients in ReLU during the backpropagation. Both issues enlarge the difference between the predicted changes of the loss function from gradient and corresponding actual changes, and mislead the gradients which results in larger perturbations. Therefore, we propose a universal adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient based white-box attack algorithms. During the backpropagation of the network, our approach calculates the gradient of the loss function versus network input, maps the values to scores, and selects a part of them to update the misleading gradients. Comprehensive experimental results on emph{ImageNet} demonstrate that our ADV-ReLU can be easily integrated into many state-of-the-art gradient-based white-box attack algorithms, as well as transferred to black-box attack attackers, to further decrease perturbations in the ${ell _2}$-norm.

التعلم الآلي التشفير والأمن الرؤية الحاسوبية وتمييز الأنماط

Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks

107 - Ali Shafahi , W. Ronny Huang , Mahyar Najibi 2018

Data poisoning is an attack on machine learning models wherein the attacker adds examples to the training set to manipulate the behavior of the model at test time. This paper explores poisoning attacks on neural nets. The proposed attacks use clean-l abels; they dont require the attacker to have any control over the labeling of training data. They are also targeted; they control the behavior of the classifier on a $textit{specific}$ test instance without degrading overall classifier performance. For example, an attacker could add a seemingly innocuous image (that is properly labeled) to a training set for a face recognition engine, and control the identity of a chosen person at test time. Because the attacker does not need to control the labeling function, poisons could be entered into the training set simply by leaving them on the web and waiting for them to be scraped by a data collection bot. We present an optimization-based method for crafting poisons, and show that just one single poison image can control classifier behavior when transfer learning is used. For full end-to-end training, we present a watermarking strategy that makes poisoning reliable using multiple ($approx$50) poisoned training instances. We demonstrate our method by generating poisoned frog images from the CIFAR dataset and using them to manipulate image classifiers.

التعلم الآلي التشفير والأمن الرؤية الحاسوبية وتمييز الأنماط

Stochastic sparse adversarial attacks

138 - Manon Cesaire , Hatem Hajri , Sylvain Lamprier 2020

This paper introduces stochastic sparse adversarial attacks (SSAA), simple, fast and purely noise-based targeted and untargeted $L_0$ attacks of neural network classifiers (NNC). SSAA are devised by exploiting a simple small-time expansion idea widel y used for Markov processes and offer new examples of $L_0$ attacks whose studies have been limited. They are designed to solve the known scalability issue of the family of Jacobian-based saliency maps attacks to large datasets and they succeed in solving it. Experiments on small and large datasets (CIFAR-10 and ImageNet) illustrate further advantages of SSAA in comparison with the-state-of-the-art methods. For instance, in the untargeted case, our method called Voting Folded Gaussian Attack (VFGA) scales efficiently to ImageNet and achieves a significantly lower $L_0$ score than SparseFool (up to $frac{2}{5}$ lower) while being faster. Moreover, VFGA achieves better $L_0$ scores on ImageNet than Sparse-RS when both attacks are fully successful on a large number of samples. Codes are publicly available through the link https://github.com/SSAA3/stochastic-sparse-adv-attacks

التعلم الآلي التشفير والأمن الرؤية الحاسوبية وتمييز الأنماط

Black-box Adversarial Attacks on Video Recognition Models

171 - Linxi Jiang , Xingjun Ma , Shaoxiang Chen 2019

Deep neural networks (DNNs) are known for their vulnerability to adversarial examples. These are examples that have undergone small, carefully crafted perturbations, and which can easily fool a DNN into making misclassifications at test time. Thus fa r, the field of adversarial research has mainly focused on image models, under either a white-box setting, where an adversary has full access to model parameters, or a black-box setting where an adversary can only query the target model for probabilities or labels. Whilst several white-box attacks have been proposed for video models, black-box video attacks are still unexplored. To close this gap, we propose the first black-box video attack framework, called V-BAD. V-BAD utilizes tentative perturbations transferred from image models, and partition-based rectifications found by the NES on partitions (patches) of tentative perturbations, to obtain good adversarial gradient estimates with fewer queries to the target model. V-BAD is equivalent to estimating the projection of an adversarial gradient on a selected subspace. Using three benchmark video datasets, we demonstrate that V-BAD can craft both untargeted and targeted attacks to fool two state-of-the-art deep video recognition models. For the targeted attack, it achieves $>$93% success rate using only an average of $3.4 sim 8.4 times 10^4$ queries, a similar number of queries to state-of-the-art black-box image attacks. This is despite the fact that videos often have two orders of magnitude higher dimensionality than static images. We believe that V-BAD is a promising new tool to evaluate and improve the robustness of video recognition models to black-box adversarial attacks.

التعلم الآلي التشفير والأمن الرؤية الحاسوبية وتمييز الأنماط