No Arabic abstract
Facial aging and facial rejuvenation analyze a given face photograph to predict a future look or estimate a past look of the person. To achieve this, it is critical to preserve human identity and the corresponding aging progression and regression with high accuracy. However, existing methods cannot simultaneously handle these two objectives well. We propose a novel generative adversarial network based approach, named the Conditional Multi-Adversarial AutoEncoder with Ordinal Regression (CMAAE-OR). It utilizes an age estimation technique to control the aging accuracy and takes a high-level feature representation to preserve personalized identity. Specifically, the face is first mapped to a latent vector through a convolutional encoder. The latent vector is then projected onto the face manifold conditional on the age through a deconvolutional generator. The latent vector preserves personalized face features and the age controls facial aging and rejuvenation. A discriminator and an ordinal regression are imposed on the encoder and the generator in tandem, making the generated face images to be more photorealistic while simultaneously exhibiting desirable aging effects. Besides, a high-level feature representation is utilized to preserve personalized identity of the generated face. Experiments on two benchmark datasets demonstrate appealing performance of the proposed method over the state-of-the-art.
Employing deep learning-based approaches for fine-grained facial expression analysis, such as those involving the estimation of Action Unit (AU) intensities, is difficult due to the lack of a large-scale dataset of real faces with sufficiently diverse AU labels for training. In this paper, we consider how AU-level facial image synthesis can be used to substantially augment such a dataset. We propose an AU synthesis framework that combines the well-known 3D Morphable Model (3DMM), which intrinsically disentangles expression parameters from other face attributes, with models that adversarially generate 3DMM expression parameters conditioned on given target AU labels, in contrast to the more conventional approach of generating facial images directly. In this way, we are able to synthesize new combinations of expression parameters and facial images from desired AU labels. Extensive quantitative and qualitative results on the benchmark DISFA dataset demonstrate the effectiveness of our method on 3DMM facial expression parameter synthesis and data augmentation for deep learning-based AU intensity estimation.
In this work, we propose a novel approach for generating videos of the six basic facial expressions given a neutral face image. We propose to exploit the face geometry by modeling the facial landmarks motion as curves encoded as points on a hypersphere. By proposing a conditional version of manifold-valued Wasserstein generative adversarial network (GAN) for motion generation on the hypersphere, we learn the distribution of facial expression dynamics of different classes, from which we synthesize new facial expression motions. The resulting motions can be transformed to sequences of landmarks and then to images sequences by editing the texture information using another conditional Generative Adversarial Network. To the best of our knowledge, this is the first work that explores manifold-valued representations with GAN to address the problem of dynamic facial expression generation. We evaluate our proposed approach both quantitatively and qualitatively on two public datasets; Oulu-CASIA and MUG Facial Expression. Our experimental results demonstrate the effectiveness of our approach in generating realistic videos with continuous motion, realistic appearance and identity preservation. We also show the efficiency of our framework for dynamic facial expressions generation, dynamic facial expression transfer and data augmentation for training improved emotion recognition models.
Image ordinal estimation is to predict the ordinal label of a given image, which can be categorized as an ordinal regression problem. Recent methods formulate an ordinal regression problem as a series of binary classification problems. Such methods cannot ensure that the global ordinal relationship is preserved since the relationships among different binary classifiers are neglected. We propose a novel ordinal regression approach, termed Convolutional Ordinal Regression Forest or CORF, for image ordinal estimation, which can integrate ordinal regression and differentiable decision trees with a convolutional neural network for obtaining precise and stable global ordinal relationships. The advantages of the proposed CORF are twofold. First, instead of learning a series of binary classifiers emph{independently}, the proposed method aims at learning an ordinal distribution for ordinal regression by optimizing those binary classifiers emph{simultaneously}. Second, the differentiable decision trees in the proposed CORF can be trained together with the ordinal distribution in an end-to-end manner. The effectiveness of the proposed CORF is verified on two image ordinal estimation tasks, i.e. facial age estimation and image aesthetic assessment, showing significant improvements and better stability over the state-of-the-art ordinal regression methods.
While unsupervised variational autoencoders (VAE) have become a powerful tool in neuroimage analysis, their application to supervised learning is under-explored. We aim to close this gap by proposing a unified probabilistic model for learning the latent space of imaging data and performing supervised regression. Based on recent advances in learning disentangled representations, the novel generative process explicitly models the conditional distribution of latent representations with respect to the regression target variable. Performing a variational inference procedure on this model leads to joint regularization between the VAE and a neural-network regressor. In predicting the age of 245 subjects from their structural Magnetic Resonance (MR) images, our model is more accurate than state-of-the-art methods when applied to either region-of-interest (ROI) measurements or raw 3D volume images. More importantly, unlike simple feed-forward neural-networks, disentanglement of age in latent representations allows for intuitive interpretation of the structural developmental patterns of the human brain.
Deep learning-based methods have achieved promising performance in early detection and classification of lung nodules, most of which discard unsure nodules and simply deal with a binary classification -- malignant vs benign. Recently, an unsure data model (UDM) was proposed to incorporate those unsure nodules by formulating this problem as an ordinal regression, showing better performance over traditional binary classification. To further explore the ordinal relationship for lung nodule classification, this paper proposes a meta ordinal regression forest (MORF), which improves upon the state-of-the-art ordinal regression method, deep ordinal regression forest (DORF), in three major ways. First, MORF can alleviate the biases of the predictions by making full use of deep features while DORF needs to fix the composition of decision trees before training. Second, MORF has a novel grouped feature selection (GFS) module to re-sample the split nodes of decision trees. Last, combined with GFS, MORF is equipped with a meta learning-based weighting scheme to map the features selected by GFS to tree-wise weights while DORF assigns equal weights for all trees. Experimental results on the LIDC-IDRI dataset demonstrate superior performance over existing methods, including the state-of-the-art DORF.