Deep learning has come a long way and has enjoyed unprecedented success. Despite their high accuracy, however, deep models are brittle and are easily fooled by imperceptible adversarial perturbations. In contrast to common inference-time attacks, backdoor (aka Trojan) attacks target the training phase of model construction and are extremely difficult to combat, since a) the model behaves normally on a pristine testing set and b) the injected perturbations can be minute and may affect only a few training samples. Here, I propose a new method to determine whether a model has been subject to a backdoor attack. The idea is to generate adversarial examples, targeted or untargeted, using conventional attacks such as FGSM, and then feed them back to the classifier. By computing simple per-class statistics (here, mean maps) of these images and comparing them with the statistics of a reference model, it is possible to visually locate the perturbed regions and unveil the attack.
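A minimal sketch of this inspection pipeline, assuming a suspect `model`, a clean `reference_model`, a labeled `loader` of test batches, and illustrative hyper-parameters; the helper names (`fgsm`, `mean_maps`) are mine, not the paper's code:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """Untargeted FGSM: one signed-gradient ascent step on the input."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def mean_maps(model, loader, num_classes, eps=0.03):
    """Pixel-wise mean of adversarial examples, grouped by the class
    the model assigns to them after the attack."""
    sums = [0.0] * num_classes
    counts = [0] * num_classes
    for x, y in loader:
        x_adv = fgsm(model, x, y, eps)
        with torch.no_grad():
            preds = model(x_adv).argmax(dim=1)
        for c in range(num_classes):
            sel = x_adv[preds == c]
            if len(sel):
                sums[c] = sums[c] + sel.sum(dim=0)
                counts[c] += len(sel)
    return [s / max(n, 1) for s, n in zip(sums, counts)]

# Large, localized differences between the suspect and reference maps
# point at candidate trigger regions:
# suspect = mean_maps(model, loader, 10)
# clean   = mean_maps(reference_model, loader, 10)
# diffs   = [(s - c).abs() for s, c in zip(suspect, clean)]
```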
Triplet loss is an extremely common approach to distance metric learning. Representations of images from the same class are optimized to be mapped closer together in an embedding space than representations of images from different classes.
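For reference, the standard triplet loss is L(a, p, n) = max(0, d(a, p) - d(a, n) + m) for anchor a, positive p, negative n, and margin m. A minimal PyTorch sketch (the embedding dimension, batch size, and margin value are illustrative):

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    d_pos = F.pairwise_distance(anchor, positive)  # d(a, p)
    d_neg = F.pairwise_distance(anchor, negative)  # d(a, n)
    # Penalize triplets where the positive is not closer than the
    # negative by at least `margin`.
    return F.relu(d_pos - d_neg + margin).mean()

a, p, n = torch.randn(8, 128), torch.randn(8, 128), torch.randn(8, 128)
# Matches PyTorch's built-in implementation.
assert torch.isclose(
    triplet_loss(a, p, n),
    F.triplet_margin_loss(a, p, n, margin=0.2),
)
```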
We propose a novel approach for generating unrestricted adversarial examples by manipulating fine-grained aspects of image generation. Unlike existing unrestricted attacks, which typically hand-craft geometric transformations, we learn stylistic and stochastic modifications of the generation process.
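A hedged sketch of the general idea, not the paper's actual method: instead of perturbing pixels within an epsilon-ball, one can optimize the fine-grained inputs of a generative model so that the synthesized image is misclassified. The `generator`, `classifier`, latent `z`, disentangled `style` code, and target class are all assumed interfaces here:

```python
import torch
import torch.nn.functional as F

def style_attack(generator, classifier, z, style, target, steps=50, lr=0.01):
    """Optimize a generator's style code so its output is assigned
    the `target` class; `z` is held fixed."""
    style = style.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([style], lr=lr)
    for _ in range(steps):
        img = generator(z, style)                        # always on the generator's manifold
        loss = F.cross_entropy(classifier(img), target)  # pull prediction toward `target`
        opt.zero_grad()
        loss.backward()
        opt.step()
    return generator(z, style).detach()
```

Because the output is always a sample from the generator, the attack is unrestricted: it is not tied to a small norm-bounded perturbation of an existing image.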
Adversarial examples for deep neural networks are receiving ever-increasing attention because they help in understanding and reducing these models' sensitivity to their input. This is natural given the growing number of applications of deep neural networks in our everyday lives.
We find that images contain intrinsic structure that enables the reversal of many adversarial attacks. Attack vectors cause not only image classifiers to fail, but also collaterally disrupt incidental structure in the image.
Deep neural networks are vulnerable to adversarial examples, which can mislead classifiers by adding imperceptible perturbations. An intriguing property of adversarial examples is their good transferability, which makes black-box attacks feasible in real-world applications.
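A minimal sketch of the transfer setting this describes, assuming a white-box `surrogate` model and a black-box `target` model (both names are illustrative): adversarial examples crafted against the surrogate are simply re-used against the target.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """Untargeted FGSM: one signed-gradient ascent step on the input."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def transfer_attack(surrogate, target, x, y, eps=0.03):
    """Craft examples on the surrogate, then query the target once."""
    x_adv = fgsm(surrogate, x, y, eps)  # gradients come from the surrogate only
    with torch.no_grad():
        fooled = target(x_adv).argmax(dim=1) != y  # target used as a black box
    return x_adv, fooled.float().mean()  # fraction of examples that transfer
```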