Deep neural networks are vulnerable to adversarial examples, even in the black-box setting, where the attacker is restricted solely to query access. Existing black-box approaches to generating adversarial examples typically require a significant number of queries, either for training a substitute network or for estimating gradients. We introduce GenAttack, a gradient-free optimization technique that uses genetic algorithms to synthesize adversarial examples in the black-box setting. Our experiments on different datasets (MNIST, CIFAR-10, and ImageNet) show that GenAttack can successfully generate visually imperceptible adversarial examples against state-of-the-art image recognition models with orders of magnitude fewer queries than previous approaches. Against MNIST and CIFAR-10 models, GenAttack required roughly 2,126 and 2,568 times fewer queries, respectively, than ZOO, the prior state-of-the-art black-box attack. To scale the attack to large, high-dimensional ImageNet models, we perform a series of optimizations that further improve query efficiency, leading to 237 times fewer queries against the Inception-v3 model than ZOO. Furthermore, we show that GenAttack can successfully attack some state-of-the-art ImageNet defenses, including ensemble adversarial training and non-differentiable or randomized input transformations. Our results suggest that evolutionary algorithms open up a promising area of research into effective black-box attacks.
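To make the mechanism behind such a genetic-algorithm attack concrete, the sketch below shows a minimal attack loop in Python. It is an illustration of the general technique, not the paper's implementation: `query_probs` is a hypothetical stand-in for the victim model's query interface, and the hyperparameter values (population size, mutation rate, eps) are placeholders rather than the paper's tuned settings.

```python
import numpy as np

# Hypothetical stand-in for the victim model's query interface; it should
# return a vector of class probabilities for a single input image in [0, 1].
def query_probs(image):
    raise NotImplementedError("plug black-box model queries in here")

def gen_attack(x_orig, target, eps=0.05, pop_size=6, mutation_rate=0.05,
               mutation_range=0.1, max_steps=1000, rng=None):
    """Minimal genetic-algorithm attack loop in the spirit of GenAttack.

    Evolves a population of L-infinity-bounded perturbations toward
    maximizing the log-probability of the target class. Hyperparameters
    are illustrative placeholders, not the paper's tuned settings.
    """
    rng = rng or np.random.default_rng(0)
    # Initialize the population uniformly inside the eps ball.
    pop = rng.uniform(-eps, eps, size=(pop_size,) + x_orig.shape)
    for _ in range(max_steps):
        candidates = np.clip(x_orig + pop, 0.0, 1.0)
        probs = np.array([query_probs(c) for c in candidates])  # one query each
        fitness = np.log(probs[:, target] + 1e-12)  # higher is better
        best = int(np.argmax(fitness))
        if int(np.argmax(probs[best])) == target:
            return candidates[best]  # success: target class is top-1
        # Elitism: the fittest member survives unchanged.
        children = [pop[best]]
        # Selection probabilities via a softmax over fitness values.
        select_p = np.exp(fitness - fitness.max())
        select_p /= select_p.sum()
        for _ in range(pop_size - 1):
            i, j = rng.choice(pop_size, size=2, p=select_p)
            # Uniform crossover, biased toward the fitter parent.
            bias = select_p[i] / (select_p[i] + select_p[j])
            mask = rng.random(x_orig.shape) < bias
            child = np.where(mask, pop[i], pop[j])
            # Sparse mutation on a small fraction of coordinates.
            mutate = rng.random(x_orig.shape) < mutation_rate
            child += mutate * rng.uniform(-mutation_range, mutation_range,
                                          size=x_orig.shape)
            children.append(np.clip(child, -eps, eps))  # stay in the eps ball
        pop = np.stack(children)
    return None  # failed within the query budget
```

Note that each generation costs one model query per population member, which is why keeping the population small matters for query efficiency.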
Black-box optimization (BBO) has a broad range of applications, including automatic machine learning, engineering, physics, and experimental design. However, it remains a challenge for users to apply BBO methods to their problems at hand with existing […]
We perform a comprehensive study on the performance of derivative-free optimization (DFO) algorithms for the generation of targeted black-box adversarial attacks on Deep Neural Network (DNN) classifiers, assuming the perturbation energy is bounded by […]
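The truncated sentence above describes a standard setup in this line of work: the attacker minimizes a targeted loss over perturbations whose energy is bounded, typically in an lp norm, using only function evaluations. As a minimal illustration (assuming, for the sake of the example, an l-infinity budget eps; this is not any specific algorithm from the study, and `query_probs` is again a hypothetical model interface), a derivative-free random local search with projection onto the eps ball might look like:

```python
import numpy as np

# Hypothetical black-box interface returning class probabilities.
def query_probs(image):
    raise NotImplementedError("plug black-box model queries in here")

def targeted_loss(x, delta, target):
    """Targeted objective to MINIMIZE: negative log-probability of the target."""
    probs = query_probs(np.clip(x + delta, 0.0, 1.0))
    return -np.log(probs[target] + 1e-12)

def random_search_attack(x, target, eps=0.05, max_queries=2000, rng=None):
    """A simple DFO baseline: random local search under an L-infinity budget.

    Each step proposes a sparse random change to the perturbation, projects
    it back onto the eps ball, and keeps it only if the loss improves.
    """
    rng = rng or np.random.default_rng(0)
    delta = np.zeros_like(x)
    best = targeted_loss(x, delta, target)
    for _ in range(max_queries):
        step = rng.uniform(-eps, eps, size=x.shape)
        step *= rng.random(x.shape) < 0.01           # touch ~1% of coordinates
        proposal = np.clip(delta + step, -eps, eps)  # L-infinity projection
        loss = targeted_loss(x, proposal, target)
        if loss < best:
            delta, best = proposal, loss
    return np.clip(x + delta, 0.0, 1.0)
```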
Deep neural networks (DNNs) are playing key roles in various artificial intelligence applications such as image classification and object recognition. However, a growing number of studies have shown that there exist adversarial examples in DNNs, which […]
Black-box optimization is particularly important for many compute-intensive applications, including reinforcement learning (RL), robot control, etc. This paper presents a novel theoretical framework for black-box optimization, in which our method performs […]
We study the problem of finding a universal (image-agnostic) perturbation to fool machine learning (ML) classifiers (e.g., neural nets, decision trees) in the hard-label black-box setting. Recent work in adversarial ML in the white-box setting (model […]