The common implementation of face recognition systems as a cascade of a detection stage and a recognition or verification stage can cause problems beyond failures of the detector. When the detector succeeds, it can detect faces that cannot be recognized, no matter how capable the recognition system. Recognizability, a latent variable, should therefore be factored into the design and implementation of face recognition systems. We propose a measure of recognizability of a face image that leverages a key empirical observation: an embedding of face images, implemented by a deep neural network trained using mostly recognizable identities, induces a partition of the hypersphere whereby unrecognizable identities cluster together. This occurs regardless of the phenomenon that causes a face to be unrecognizable, be it optical or motion blur, partial occlusion, spatial quantization, or poor illumination. Therefore, we use the distance from such an unrecognizable identity as a measure of recognizability, and incorporate it in the design of the overall system. We show that accounting for recognizability reduces the error rate of single-image face recognition by 58% at FAR=1e-5 on the IJB-C Covariate Verification benchmark, and reduces the verification error rate by 24% at FAR=1e-5 in set-based recognition on the IJB-C benchmark.
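The recognizability measure described above can be illustrated with a minimal sketch. It assumes that the "unrecognizable identity" is summarized by a single centroid on the unit hypersphere and that the distance is cosine distance; both choices, along with the function names `ui_centroid_from` and `recognizability`, are illustrative assumptions rather than the authors' implementation.

```python
import math

def l2_normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def cosine_distance(a, b):
    """1 - cosine similarity, computed on normalized copies of a and b."""
    a, b = l2_normalize(a), l2_normalize(b)
    return 1.0 - sum(x * y for x, y in zip(a, b))

def ui_centroid_from(embeddings):
    """Assumed construction: average the normalized embeddings of faces
    known to be unrecognizable, then renormalize the mean."""
    dim = len(embeddings[0])
    mean = [0.0] * dim
    for e in embeddings:
        for i, x in enumerate(l2_normalize(e)):
            mean[i] += x / len(embeddings)
    return l2_normalize(mean)

def recognizability(embedding, ui_centroid):
    """Distance from the unrecognizable-identity centroid.
    Larger values suggest a more recognizable face."""
    return cosine_distance(embedding, ui_centroid)

# Toy 3-D embeddings standing in for real network outputs.
ui = ui_centroid_from([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]])
blurry_face = [1.0, 0.05, 0.0]   # lies close to the unrecognizable cluster
clear_face = [0.0, 1.0, 0.0]     # lies far from it
```

In a cascade, a score like this could be thresholded to reject a detected face before matching, or used to weight images when aggregating a set; which of these the authors use is not specified in the abstract.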
While deep face recognition has benefited significantly from large-scale labeled data, current research is focused on leveraging unlabeled data to further boost performance, reducing the cost of human annotation. Prior work has mostly been in controlled settings, where the labeled and unlabeled data sets have no overlapping identities by construction. This is not realistic in large-scale face recognition, where one must contend with such overlaps, the frequency of which increases with the volume of data. Ignoring identity overlap leads to significant labeling noise, as data from the same identity is split into multiple clusters. To address this, we propose a novel identity separation method based on extreme value theory. It is formulated as an out-of-distribution detection algorithm, and greatly reduces the problems caused by overlapping-identity label noise. Considering cluster assignments as pseudo-labels, we must also overcome the labeling noise from clustering errors. We propose a modulation of the cosine loss, where the modulation weights correspond to an estimate of clustering uncertainty. Extensive experiments in both controlled and real settings demonstrate our method's consistent improvements over supervised baselines, e.g., 11.6% improvement on IJB-A verification.
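The idea of modulating a cosine loss by clustering uncertainty can be sketched as follows. This is one plausible form under stated assumptions, not the paper's formulation: the base loss is a CosFace-style margin softmax, the per-sample weight is `(1 - uncertainty) ** gamma`, and the batch loss is normalized by the sum of weights. The names `cosine_margin_loss`, `modulated_loss`, and `gamma` are hypothetical.

```python
import math

def cosine_margin_loss(cosines, label, scale=30.0, margin=0.35):
    """CosFace-style loss for one sample, given its cosine similarity to
    every class center. The margin is subtracted from the target class."""
    logits = [scale * (c - margin if j == label else c)
              for j, c in enumerate(cosines)]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def modulated_loss(batch, gamma=1.0, scale=30.0, margin=0.35):
    """batch: list of (cosines, pseudo_label, uncertainty), uncertainty in [0, 1].
    Samples with uncertain cluster assignments are down-weighted (assumed form)."""
    total, weight_sum = 0.0, 0.0
    for cosines, label, uncertainty in batch:
        weight = (1.0 - uncertainty) ** gamma
        total += weight * cosine_margin_loss(cosines, label, scale, margin)
        weight_sum += weight
    return total / weight_sum if weight_sum > 0.0 else 0.0

# Two pseudo-labeled samples: one confidently clustered, one fully uncertain.
confident = ([0.8, 0.1, -0.2], 0, 0.0)
uncertain = ([0.3, 0.2, 0.1], 1, 1.0)
loss = modulated_loss([confident, uncertain])
```

With an uncertainty of 1.0 the second sample contributes nothing, so the batch loss reduces to the loss of the confidently clustered sample alone.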
Due to its convenience, biometric authentication, especially face authentication, has become increasingly mainstream and thus is now a prime target for attackers. Presentation attacks and face morphing are typical types of attack. Previous research has shown that finger-vein- and fingerprint-based authentication methods are susceptible to wolf attacks, in which a wolf sample matches many enrolled user templates. In this work, we demonstrated that wolf (generic) faces, which we call master faces, can also compromise face recognition systems and that the master face concept can be generalized in some cases. Motivated by recent similar work in the fingerprint domain, we generated high-quality master faces by using the state-of-the-art face generator StyleGAN in a process called latent variable evolution. Experiments demonstrated that even attackers with limited resources using only pre-trained models available on the Internet can initiate master face attacks. The results, in addition to demonstrating performance from the attacker's point of view, can also be used to clarify and improve the performance of face recognition systems and harden face authentication systems.
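Latent variable evolution, as named above, searches the generator's latent space for a vector whose generated face matches many enrolled templates. The sketch below shows only the search loop, under heavy assumptions: a simple elitist evolution strategy with Gaussian mutation stands in for whatever optimizer the authors used, `toy_face_embedding` is a hypothetical placeholder for "generate a face with StyleGAN, then embed it with a face recognition model", and the fitness is the mean cosine similarity to the enrolled templates.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def toy_face_embedding(latent):
    # Hypothetical stand-in: a real pipeline would run the generator on the
    # latent vector and embed the resulting image. Here it is the identity map.
    return latent

def fitness(latent, enrolled):
    """Assumed objective: mean similarity to all enrolled templates."""
    emb = toy_face_embedding(latent)
    return sum(cosine(emb, t) for t in enrolled) / len(enrolled)

def evolve_latent(enrolled, dim, generations=30, population=16,
                  parents=4, sigma=0.3, seed=0):
    """Elitist evolution strategy: keep the best latents, mutate them,
    and repeat. Returns the best latent and the best fitness per step."""
    rng = random.Random(seed)
    pool = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(population)]
    history = []
    for _ in range(generations):
        pool.sort(key=lambda z: fitness(z, enrolled), reverse=True)
        elite = pool[:parents]
        history.append(fitness(elite[0], enrolled))
        children = [[x + rng.gauss(0, sigma) for x in rng.choice(elite)]
                    for _ in range(population - parents)]
        pool = elite + children  # elites survive, so the best never regresses
    best = max(pool, key=lambda z: fitness(z, enrolled))
    history.append(fitness(best, enrolled))
    return best, history

# Toy enrolled templates standing in for real user embeddings.
enrolled = [[1.0, 0.2, 0.0], [1.0, -0.2, 0.1], [0.9, 0.0, -0.1]]
best, history = evolve_latent(enrolled, dim=3)
```

Because the elites are carried over unchanged, the best fitness in `history` cannot decrease from one generation to the next. A defender can run the same loop against their own system to estimate how many templates a single generated face would match.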
Face recognition capabilities have recently made extraordinary leaps. Though this progress is at least partially due to ballooning training set sizes -- huge numbers of face images downloaded and labeled for identity -- it is not clear if the formidable task of collecting so many images is truly necessary. We propose a far more accessible means of increasing training data sizes for face recognition systems. Rather than manually harvesting and labeling more faces, we simply synthesize them. We describe novel methods of enriching an existing dataset with important facial appearance variations by manipulating the faces it contains. We further apply this synthesis approach when matching query images represented using a standard convolutional neural network. The effect of training and testing with synthesized images is extensively tested on the LFW and IJB-A (verification and identification) benchmarks and Janus CS2. The performance obtained by our approach matches state-of-the-art results reported by systems trained on millions of downloaded images.
Unveiling face images of a subject given his/her high-level representations extracted from a blackbox Face Recognition engine is extremely challenging. This is because of the limited information accessible from that engine, including its structure and uninterpretable extracted features. This paper presents a novel generative structure with Bijective Metric Learning, namely Bijective Generative Adversarial Networks in a Distillation framework (DiBiGAN), for synthesizing faces of an identity given that person's features. In order to effectively address this problem, this work first introduces a bijective metric so that the distance measurement and metric learning process can be directly adopted in the image domain for an image reconstruction task. Second, a distillation process is introduced to maximize the information exploited from the blackbox face recognition engine. Then a Feature-Conditional Generator Structure with Exponential Weighting Strategy is presented for a more robust generator that can synthesize realistic faces with ID preservation. Results on several benchmarking datasets including CelebA, LFW, AgeDB, and CFP-FP against matching engines have demonstrated the effectiveness of DiBiGAN on both image realism and ID preservation properties.
In recent years, tremendous strides have been made in face detection thanks to deep learning. However, most published face detectors deteriorate dramatically as the faces become smaller. In this paper, we present the Small Faces Attention (SFA) face detector to better detect small-scale faces. First, we propose a new scale-invariant face detection architecture that pays more attention to small faces, including a 4-branch detection architecture and a small-face-sensitive anchor design. Second, a feature map fusion strategy is applied in SFA by partially combining high-level features into low-level features to further improve the ability to find hard faces. Third, we use a multi-scale training and testing strategy to enhance face detection performance in practice. Comprehensive experiments show that SFA significantly improves face detection performance, especially on small faces. Our real-time SFA face detector can run at 5 FPS on a single GPU while maintaining high performance. In addition, our final SFA face detector achieves state-of-the-art detection performance on challenging face detection benchmarks, including the WIDER FACE and FDDB datasets, with competitive runtime speed. Both our code and models will be made available to the research community.