Query-Free Attacks on Industry-Grade Face Recognition Systems under Resource Constraints


Abstract

To launch black-box attacks against a Deep Neural Network (DNN) based Face Recognition (FR) system, one needs to build substitute models to simulate the target model, so that adversarial examples discovered on the substitute models can also mislead the target model. Such transferability is achieved in recent studies by querying the target model to obtain data for training the substitute models. A real-world target, such as the FR system used by law enforcement, however, is less accessible to the adversary. To attack such a system, a substitute model of similar quality to the target model is needed to identify their common defects. This is hard, since the adversary often does not have enough resources to train such a powerful model (training a commercial FR system takes hundreds of millions of images and rooms of GPUs). We found in our research, however, that a resource-constrained adversary can still effectively approximate the target model's capability to recognize specific individuals, by training biased substitute models on additional images of those victims whose identities the attacker wants to conceal or impersonate. This is made possible by a new property we discovered, called Nearly Local Linearity (NLL), which models the observation that an ideal DNN model produces image representations (embeddings) whose pairwise distances truthfully describe the human perception of the differences among the input images. By simulating this property around the victims' images, we significantly improve the transferability of black-box impersonation attacks, by nearly 50%. In particular, we successfully attacked a commercial system trained on over 20 million images, using only 4 million images and 1/5 of the training time, yet achieving 62% transferability in an impersonation attack and 89% in a dodging attack.
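To make the NLL property concrete, the following is a minimal sketch of what an NLL-style regularizer could look like when fine-tuning a biased substitute model. It assumes NLL is enforced along the straight interpolation path between a victim image and another image: the embedding distance to the victim endpoint should grow roughly linearly with the interpolation coefficient. The function name nll_loss, the choice of L2 distance on normalized embeddings, and the sampling of interior points are all illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of a Nearly-Local-Linearity (NLL) style regularizer.
# All names and design choices here are illustrative assumptions.
import torch
import torch.nn.functional as F

def nll_loss(substitute, x_victim, x_other, num_points=5):
    """Penalize deviation from linearity of embedding distances along the
    straight path between a victim image and another image.

    substitute: a model mapping image batches to embedding vectors.
    x_victim, x_other: image batches of identical shape.
    """
    emb_v = F.normalize(substitute(x_victim), dim=-1)
    emb_o = F.normalize(substitute(x_other), dim=-1)
    # Embedding distance between the two path endpoints.
    d_total = (emb_v - emb_o).norm(dim=-1)

    loss = 0.0
    # Sample interior points of the path (exclude the endpoints).
    for lam in torch.linspace(0.0, 1.0, num_points + 2)[1:-1]:
        x_mid = (1.0 - lam) * x_victim + lam * x_other
        emb_mid = F.normalize(substitute(x_mid), dim=-1)
        d_mid = (emb_v - emb_mid).norm(dim=-1)
        # NLL: distance from the victim should scale (nearly) linearly
        # with the interpolation coefficient lam.
        loss = loss + (d_mid - lam * d_total).pow(2).mean()
    return loss / num_points
```

Under this reading, the term would be added to the substitute model's ordinary training loss for batches containing the victim's images, biasing the substitute toward matching the target's behavior in the local neighborhood that matters for the attack.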
