No Arabic abstract
Facial attributes (e.g., age and attractiveness) estimation performance has been greatly improved by using convolutional neural networks. However, existing methods have an inconsistency between the training objectives and the evaluation metric, so they may be suboptimal. In addition, these methods always adopt image classification or face recognition models with a large amount of parameters, which carry expensive computation cost and storage overhead. In this paper, we firstly analyze the essential relationship between two state-of-the-art methods (Ranking-CNN and DLDL) and show that the Ranking method is in fact learning label distribution implicitly. This result thus firstly unifies two existing popular state-of-the-art methods into the DLDL framework. Second, in order to alleviate the inconsistency and reduce resource consumption, we design a lightweight network architecture and propose a unified framework which can jointly learn facial attribute distribution and regress attribute value. The effectiveness of our approach has been demonstrated on both facial age and attractiveness estimation tasks. Our method achieves new state-of-the-art results using the single model with 36$times$(6$times$) fewer parameters and 2.6$times$(2.1$times$) faster inference speed on facial age (attractiveness) estimation. Moreover, our method can achieve comparable results as the state-of-the-art even though the number of parameters is further reduced to 0.9M (3.8MB disk storage).
Achieving high performance for facial age estimation with subjects in the borderline between adulthood and non-adulthood has always been a challenge. Several studies have used different approaches from the age of a baby to an elder adult and different datasets have been employed to measure the mean absolute error (MAE) ranging between 1.47 to 8 years. The weakness of the algorithms specifically in the borderline has been a motivation for this paper. In our approach, we have developed an ensemble technique that improves the accuracy of underage estimation in conjunction with our deep learning model (DS13K) that has been fine-tuned on the Deep Expectation (DEX) model. We have achieved an accuracy of 68% for the age group 16 to 17 years old, which is 4 times better than the DEX accuracy for such age range. We also present an evaluation of existing cloud-based and offline facial age prediction services, such as Amazon Rekognition, Microsoft Azure Cognitive Services, How-Old.net and DEX.
Human face aging is irreversible process causing changes in human face characteristics such us hair whitening, muscles drop and wrinkles. Due to the importance of human face aging in biometrics systems, age estimation became an attractive area for researchers. This paper presents a novel method to estimate the age from face images, using binarized statistical image features (BSIF) and local binary patterns (LBP)histograms as features performed by support vector regression (SVR) and kernel ridge regression (KRR). We applied our method on FG-NET and PAL datasets. Our proposed method has shown superiority to that of the state-of-the-art methods when using the whole PAL database.
We propose a generative framework based on generative adversarial network (GAN) to enhance facial attractiveness while preserving facial identity and high-fidelity. Given a portrait image as input, having applied gradient descent to recover a latent vector that this generative framework can use to synthesize an image resemble to the input image, beauty semantic editing manipulation on the corresponding recovered latent vector based on InterFaceGAN enables this framework to achieve facial image beautification. This paper compared our system with Beholder-GAN and our proposed result-enhanced version of Beholder-GAN. It turns out that our framework obtained state-of-art attractiveness enhancement results. The code is available at https://github.com/zoezhou1999/BeautifyBasedOnGAN.
Bone age assessment (BAA) is clinically important as it can be used to diagnose endocrine and metabolic disorders during child development. Existing deep learning based methods for classifying bone age use the global image as input, or exploit local information by annotating extra bounding boxes or key points. However, training with the global image underutilizes discriminative local information, while providing extra annotations is expensive and subjective. In this paper, we propose an attention-guided approach to automatically localize the discriminative regions for BAA without any extra annotations. Specifically, we first train a classification model to learn the attention maps of the discriminative regions, finding the hand region, the most discriminative region (the carpal bones), and the next most discriminative region (the metacarpal bones). Guided by those attention maps, we then crop the informative local regions from the original image and aggregate different regions for BAA. Instead of taking BAA as a general regression task, which is suboptimal due to the label ambiguity problem in the age label space, we propose using joint age distribution learning and expectation regression, which makes use of the ordinal relationship among hand images with different individual ages and leads to more robust age estimation. Extensive experiments are conducted on the RSNA pediatric bone age data set. Using no training annotations, our method achieves competitive results compared with existing state-of-the-art semi-automatic deep learning-based methods that require manual annotation. Code is available at https: //github.com/chenchao666/Bone-Age-Assessment.
Image-based age estimation aims to predict a persons age from facial images. It is used in a variety of real-world applications. Although end-to-end deep models have achieved impressive results for age estimation on benchmark datasets, their performance in-the-wild still leaves much room for improvement due to the challenges caused by large variations in head pose, facial expressions, and occlusions. To address this issue, we propose a simple yet effective method to explicitly incorporate facial semantics into age estimation, so that the model would learn to correctly focus on the most informative facial components from unaligned facial images regardless of head pose and non-rigid deformation. To this end, we design a face parsing-based network to learn semantic information at different scales and a novel face parsing attention module to leverage these semantic features for age estimation. To evaluate our method on in-the-wild data, we also introduce a new challenging large-scale benchmark called IMDB-Clean. This dataset is created by semi-automatically cleaning the noisy IMDB-WIKI dataset using a constrained clustering method. Through comprehensive experiment on IMDB-Clean and other benchmark datasets, under both intra-dataset and cross-dataset evaluation protocols, we show that our method consistently outperforms all existing age estimation methods and achieves a new state-of-the-art performance. To the best of our knowledge, our work presents the first attempt of leveraging face parsing attention to achieve semantic-aware age estimation, which may be inspiring to other high level facial analysis tasks.