ترغب بنشر مسار تعليمي؟ اضغط هنا

Pasadena: Perceptually Aware and Stealthy Adversarial Denoise Attack

94   0   0.0 ( 0 )
 نشر من قبل Yupeng Cheng
 تاريخ النشر 2020
والبحث باللغة English




اسأل ChatGPT حول البحث

Image denoising can remove natural noise that widely exists in images captured by multimedia devices due to low-quality imaging sensors, unstable image transmission processes, or low light conditions. Recent works also find that image denoising benefits the high-level vision tasks, e.g., image classification. In this work, we try to challenge this common sense and explore a totally new problem, i.e., whether the image denoising can be given the capability of fooling the state-of-the-art deep neural networks (DNNs) while enhancing the image quality. To this end, we initiate the very first attempt to study this problem from the perspective of adversarial attack and propose the adversarial denoise attack. More specifically, our main contributions are three-fold: First, we identify a new task that stealthily embeds attacks inside the image denoising module widely deployed in multimedia devices as an image post-processing operation to simultaneously enhance the visual image quality and fool DNNs. Second, we formulate this new task as a kernel prediction problem for image filtering and propose the adversarial-denoising kernel prediction that can produce adversarial-noiseless kernels for effective denoising and adversarial attacking simultaneously. Third, we implement an adaptive perceptual region localization to identify semantic-related vulnerability regions with which the attack can be more effective while not doing too much harm to the denoising. We name the proposed method as Pasadena (Perceptually Aware and Stealthy Adversarial DENoise Attack) and validate our method on the NeurIPS17 adversarial competition dataset, CVPR2021-AIC-VI: unrestricted adversarial attacks on ImageNet,etc. The comprehensive evaluation and analysis demonstrate that our method not only realizes denoising but also achieves a significantly higher success rate and transferability over state-of-the-art attacks.



قيم البحث

اقرأ أيضاً

Modern deep neural networks(DNNs) are vulnerable to adversarial samples. Sparse adversarial samples are a special branch of adversarial samples that can fool the target model by only perturbing a few pixels. The existence of the sparse adversarial at tack points out that DNNs are much more vulnerable than people believed, which is also a new aspect for analyzing DNNs. However, current sparse adversarial attack methods still have some shortcomings on both sparsity and invisibility. In this paper, we propose a novel two-stage distortion-aware greedy-based method dubbed as GreedyFool. Specifically, it first selects the most effective candidate positions to modify by considering both the gradient(for adversary) and the distortion map(for invisibility), then drops some less important points in the reduce stage. Experiments demonstrate that compared with the start-of-the-art method, we only need to modify $3times$ fewer pixels under the same sparse perturbation setting. For target attack, the success rate of our method is 9.96% higher than the start-of-the-art method under the same pixel budget. Code can be found at https://github.com/LightDXY/GreedyFool.
High-level representation-guided pixel denoising and adversarial training are independent solutions to enhance the robustness of CNNs against adversarial attacks by pre-processing input data and re-training models, respectively. Most recently, advers arial training techniques have been widely studied and improved while the pixel denoising-based method is getting less attractive. However, it is still questionable whether there exists a more advanced pixel denoising-based method and whether the combination of the two solutions benefits each other. To this end, we first comprehensively investigate two kinds of pixel denoising methods for adversarial robustness enhancement (i.e., existing additive-based and unexplored filtering-based methods) under the loss functions of image-level and semantic-level restorations, respectively, showing that pixel-wise filtering can obtain much higher image quality (e.g., higher PSNR) as well as higher robustness (e.g., higher accuracy on adversarial examples) than existing pixel-wise additive-based method. However, we also observe that the robustness results of the filtering-based method rely on the perturbation amplitude of adversarial examples used for training. To address this problem, we propose predictive perturbation-aware pixel-wise filtering, where dual-perturbation filtering and an uncertainty-aware fusion module are designed and employed to automatically perceive the perturbation amplitude during the training and testing process. The proposed method is termed as AdvFilter. Moreover, we combine adversarial pixel denoising methods with three adversarial training-based methods, hinting that considering data and models jointly is able to achieve more robust CNNs. The experiments conduct on NeurIPS-2017DEV, SVHN, and CIFAR10 datasets and show the advantages over enhancing CNNs robustness, high generalization to different models, and noise levels.
Adversarial attacks find perturbations that can fool models into misclassifying images. Previous works had successes in generating noisy/edge-rich adversarial perturbations, at the cost of degradation of image quality. Such perturbations, even when t hey are small in scale, are usually easily spottable by human vision. In contrast, we propose Harmonic Adversar- ial Attack Methods (HAAM), that generates edge-free perturbations by using harmonic functions. The property of edge-free guarantees that the generated adversarial images can still preserve visual quality, even when perturbations are of large magnitudes. Experiments also show that adversaries generated by HAAM often have higher rates of success when transferring between models. In addition, we find harmonic perturbations can simulate natural phenomena like natural lighting and shadows. It would then be possible to help find corner cases for given models, as a first step to improving them.
In recent years, adversarial attacks have drawn more attention for their value on evaluating and improving the robustness of machine learning models, especially, neural network models. However, previous attack methods have mainly focused on applying some $l^p$ norm-bounded noise perturbations. In this paper, we instead introduce a novel adversarial attack method based on haze, which is a common phenomenon in real-world scenery. Our method can synthesize potentially adversarial haze into an image based on the atmospheric scattering model with high realisticity and mislead classifiers to predict an incorrect class. We launch experiments on two popular datasets, i.e., ImageNet and NIPS~2017. We demonstrate that the proposed method achieves a high success rate, and holds better transferability across different classification models than the baselines. We also visualize the correlation matrices, which inspire us to jointly apply different perturbations to improve the success rate of the attack. We hope this work can boost the development of non-noise-based adversarial attacks and help evaluate and improve the robustness of DNNs.
In recent years, research on adversarial attacks has become a hot spot. Although current literature on the transfer-based adversarial attack has achieved promising results for improving the transferability to unseen black-box models, it still leaves a long way to go. Inspired by the idea of meta-learning, this paper proposes a novel architecture called Meta Gradient Adversarial Attack (MGAA), which is plug-and-play and can be integrated with any existing gradient-based attack method for improving the cross-model transferability. Specifically, we randomly sample multiple models from a model zoo to compose different tasks and iteratively simulate a white-box attack and a black-box attack in each task. By narrowing the gap between the gradient directions in white-box and black-box attacks, the transferability of adversarial examples on the black-box setting can be improved. Extensive experiments on the CIFAR10 and ImageNet datasets show that our architecture outperforms the state-of-the-art methods for both black-box and white-box attack settings.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا