
AgeFlow: Conditional Age Progression and Regression with Normalizing Flows

 Added by Zhizhong Huang
 Publication date 2021
Research language: English





Age progression and regression aim to synthesize the photorealistic appearance of a given face image with aging and rejuvenation effects, respectively. Existing generative adversarial network (GAN)-based methods suffer from three major issues: 1) unstable training that introduces strong ghost artifacts in the generated faces, 2) unpaired training that leads to unexpected changes in facial attributes such as gender and race, and 3) non-bijective age mappings that increase the uncertainty of the face transformation. To overcome these issues, this paper proposes a novel framework, termed AgeFlow, that integrates the advantages of both flow-based models and GANs. The proposed AgeFlow contains three parts: an encoder that maps a given face to a latent space through an invertible neural network, a novel invertible conditional translation module (ICTM) that translates the source latent vector to the target one, and a decoder that reconstructs the generated face from the target latent vector using the same encoder network; all parts are invertible, achieving bijective age mappings. The novelties of ICTM are two-fold. First, we propose an attribute-aware knowledge distillation to learn the manipulation direction of age progression while keeping other unrelated attributes unchanged, alleviating unexpected changes in facial attributes. Second, we propose to use GANs in the latent space to ensure the learned latent vectors are indistinguishable from real ones, which is much easier than the traditional use of GANs in the image domain. Experimental results demonstrate superior performance over existing GAN-based methods on two benchmark datasets. The source code is available at https://github.com/Hzzone/AgeFlow.
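The encode-translate-decode pipeline above can be sketched with a single affine coupling layer, a standard invertible building block of normalizing flows. This is a minimal illustration, not the paper's architecture: the toy `scale`/`shift` functions stand in for learned networks, and the latent offset stands in for the ICTM.

```python
import numpy as np

def coupling_forward(x, scale, shift):
    # Split the input; transform the second half conditioned on the first.
    x1, x2 = x[: len(x) // 2], x[len(x) // 2 :]
    z2 = x2 * np.exp(scale(x1)) + shift(x1)
    return np.concatenate([x1, z2])

def coupling_inverse(z, scale, shift):
    # Exact inverse: undo the affine transform of the second half.
    z1, z2 = z[: len(z) // 2], z[len(z) // 2 :]
    x2 = (z2 - shift(z1)) * np.exp(-scale(z1))
    return np.concatenate([z1, x2])

# Toy stand-ins for the learned conditioning networks.
scale = lambda h: np.tanh(h)
shift = lambda h: 0.5 * h

x = np.array([0.3, -1.2, 0.7, 2.0])
z = coupling_forward(x, scale, shift)            # encode to latent space
z_target = z + np.array([0.0, 0.0, 0.1, 0.1])    # latent translation (ICTM's role)
x_aged = coupling_inverse(z_target, scale, shift)  # decode via the same network, inverted

# Bijectivity: inverting without the translation recovers x exactly.
assert np.allclose(coupling_inverse(z, scale, shift), x)
```

Because encoding and decoding share one invertible network, every target latent vector maps back to exactly one image, which is the bijective age mapping the abstract refers to.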




Related research

Although impressive results have been achieved for age progression and regression, two major issues remain in generative adversarial network (GAN)-based methods: 1) conditional GAN (cGAN)-based methods can learn various effects between any two age groups in a single model, but are insufficient to characterize some specific patterns because their convolution filters are completely shared; and 2) by utilizing several models to learn effects independently, GAN-based methods can capture some specific patterns; however, they are cumbersome and require age labels in advance. To address these deficiencies and have the best of both worlds, this paper introduces a dropout-like GAN-based method (RoutingGAN) that routes different effects in a high-level semantic feature space. Specifically, we first disentangle the age-invariant features from the input face, and then gradually add the effects to the features through residual routers that assign the convolution filters to different age groups by dropping out the outputs of the others. As a result, the proposed RoutingGAN can simultaneously learn various effects in a single model, with convolution filters shared in part to learn some specific effects. Experimental results on two benchmark datasets demonstrate superior performance over existing methods both qualitatively and quantitatively.
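The dropout-like routing idea can be illustrated with binary masks over filter outputs. This is a hypothetical sketch, not the paper's code: the filter counts, the shared/specific split, and the `route` helper are all invented for illustration.

```python
import numpy as np

n_filters = 8
n_groups = 4

rng = np.random.default_rng(0)
features = rng.normal(size=(n_filters,))  # outputs of shared conv filters

# Routing masks: the first half of the filters is shared by all age groups;
# the second half is partitioned so each group owns one specific filter.
masks = np.zeros((n_groups, n_filters))
masks[:, : n_filters // 2] = 1.0              # filters shared across groups
for g in range(n_groups):
    masks[g, n_filters // 2 + g] = 1.0        # group-specific filter

def route(features, group):
    """Drop out the filter outputs not assigned to this age group."""
    return features * masks[group]

young = route(features, 0)
old = route(features, 3)
assert np.allclose(young[:4], old[:4])        # shared part is identical
assert young[4] != 0.0 and old[4] == 0.0      # specific parts differ
```

A single model thus serves every age group, while the masked-out channels let each group specialize, which is the trade-off the abstract describes.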
Deep learning based image compression has recently witnessed exciting progress and in some cases even managed to surpass transform coding based approaches that have been established and refined over many decades. However, state-of-the-art solutions for deep image compression typically employ autoencoders which map the input to a lower dimensional latent space and thus irreversibly discard information already before quantization. Due to that, they inherently limit the range of quality levels that can be covered. In contrast, traditional approaches in image compression allow for a larger range of quality levels. Interestingly, they employ an invertible transformation before performing the quantization step which explicitly discards information. Inspired by this, we propose a deep image compression method that is able to go from low bit-rates to near lossless quality by leveraging normalizing flows to learn a bijective mapping from the image space to a latent representation. In addition to this, we demonstrate further advantages unique to our solution, such as the ability to maintain constant quality results through re-encoding, even when performed multiple times. To the best of our knowledge, this is the first work to explore the opportunities for leveraging normalizing flows for lossy image compression.
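The re-encoding stability claimed above follows from bijectivity, and can be checked numerically with a toy codec. The invertible affine map below is a stand-in for a learned normalizing flow, and rounding stands in for the quantizer; both are illustrative assumptions, not the paper's method.

```python
import numpy as np

A = np.array([[2.0, 0.5], [0.0, 1.0]])   # invertible "analysis" transform
A_inv = np.linalg.inv(A)

def encode(x):
    return np.round(A @ x)                # transform, then quantize

def decode(q):
    return A_inv @ q                      # exact inverse of the transform

x = np.array([1.37, -0.82])
q1 = encode(x)                            # first encoding
x_hat = decode(q1)                        # lossy reconstruction
q2 = encode(x_hat)                        # re-encode the reconstruction

# Since the transform is bijective and q1 is already integer-valued,
# re-encoding reproduces the same code: no additional quality loss.
assert np.allclose(q1, q2)
```

An autoencoder, by contrast, discards information before quantization, so the reconstruction need not map back to the same code and quality can drift across re-encodings.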
In inverse problems, we often have access to data consisting of paired samples $(x,y)\sim p_{X,Y}(x,y)$, where $y$ are partial observations of a physical system and $x$ represents the unknowns of the problem. Under these circumstances, we can employ supervised training to learn a solution $x$ and its uncertainty from the observations $y$. We refer to this problem as the supervised case. However, the data $y\sim p_{Y}(y)$ collected at one point could be distributed differently than observations $y'\sim p_{Y'}(y')$ relevant for a current set of problems. In the context of Bayesian inference, we propose a two-step scheme, which makes use of normalizing flows and joint data to train a conditional generator $q_{\theta}(x|y)$ to approximate the target posterior density $p_{X|Y}(x|y)$. Additionally, this preliminary phase provides a density function $q_{\theta}(x|y)$, which can be recast as a prior for the unsupervised problem, e.g. when only the observations $y'\sim p_{Y'}(y')$, a likelihood model $y'|x$, and a prior on $x$ are known. We then train another invertible generator with output density $q_{\phi}(x|y')$ specifically for $y'$, allowing us to sample from the posterior $p_{X|Y'}(x|y')$. We present synthetic results that demonstrate considerable training speedup when reusing the pretrained network $q_{\theta}(x|y')$ as a warm start or preconditioning for approximating $p_{X|Y'}(x|y')$, instead of learning from scratch. This training modality can be interpreted as an instance of transfer learning. This result is particularly relevant for large-scale inverse problems that employ expensive numerical simulations.
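The conditional density $q_{\theta}(x|y)$ that such a flow provides comes from the change-of-variables formula: for an invertible map $z = f(x; y)$ with a standard-normal base, $q(x|y) = \mathcal{N}(f(x;y))\,|\partial f/\partial x|$. The affine $f$ below is a stand-in for a learned invertible network; its conditioning on $y$ is invented for illustration, and the check simply verifies that the resulting density integrates to one.

```python
import numpy as np

def log_q(x, y):
    # Conditional affine flow: z = (x - mu(y)) / sigma(y), base N(0, 1).
    mu, sigma = 0.5 * y, 1.0 + 0.1 * y**2
    z = (x - mu) / sigma
    # log base density + log |det Jacobian| = log N(z) - log sigma
    return -0.5 * z**2 - 0.5 * np.log(2 * np.pi) - np.log(sigma)

# q(x|y) must integrate to 1 over x for any fixed conditioning y.
y = 1.3
xs = np.linspace(-10.0, 10.0, 20001)
integral = np.exp(log_q(xs, y)).sum() * (xs[1] - xs[0])
assert abs(integral - 1.0) < 1e-4
```

Warm-starting the second flow $q_{\phi}$ then amounts to initializing its parameters from $\theta$ rather than at random, which is why training converges faster than learning from scratch.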
Normalizing flows, which learn a distribution by transforming the data to samples from a Gaussian base distribution, have proven to be powerful density approximators. But their expressive power is limited by this choice of base distribution. We therefore propose to generalize the base distribution to a more elaborate copula distribution in order to capture the properties of the target distribution more accurately. In a first empirical analysis, we demonstrate that this replacement can dramatically improve vanilla normalizing flows in terms of flexibility, stability, and effectiveness for heavy-tailed data. Our results suggest that the improvements are related to an increased local Lipschitz stability of the learned flow.
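A copula base distribution decouples the dependence structure from the marginals. The sketch below builds a Gaussian copula with non-Gaussian marginals; exponential marginals are used only because their inverse CDF is closed-form, and all constants are illustrative, not taken from the paper.

```python
import math
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
rho = 0.8

# Step 1: correlated standard normals (the Gaussian copula's backbone).
z1 = rng.standard_normal(n)
z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.standard_normal(n)

# Step 2: map to uniforms on (0, 1) via the standard normal CDF.
phi = np.vectorize(lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0))))
u1, u2 = phi(z1), phi(z2)

# Step 3: impose non-Gaussian marginals via an inverse CDF (Exponential(1)).
x1 = -np.log(1.0 - u1)
x2 = -np.log(1.0 - u2)

# Marginals are exponential (mean ~ 1), yet the dependence is preserved.
assert abs(x1.mean() - 1.0) < 0.02
assert np.corrcoef(x1, x2)[0, 1] > 0.5
```

Replacing the Gaussian base of a flow with such a distribution lets the base itself carry heavy tails, so the flow no longer has to stretch a light-tailed Gaussian to cover them.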
Recent work has shown that Neural Ordinary Differential Equations (ODEs) can serve as generative models of images from the perspective of Continuous Normalizing Flows (CNFs). Such models offer exact likelihood calculation and invertible generation/density estimation. In this work we introduce a Multi-Resolution variant of such models (MRCNF) by characterizing the conditional distribution over the additional information required to generate a fine image that is consistent with the coarse image. We introduce a transformation between resolutions that leaves the log likelihood unchanged. We show that this approach yields comparable likelihood values for various image datasets, with improved performance at higher resolutions and fewer parameters, using only one GPU. Further, we examine the out-of-distribution properties of (Multi-Resolution) Continuous Normalizing Flows and find that they are similar to those of other likelihood-based generative models.
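A likelihood-preserving transform between resolutions can be illustrated with an orthonormal (Haar-like) average/difference map: its Jacobian determinant has absolute value 1, so the change-of-variables correction to the log likelihood vanishes. This 1-D sketch is an assumption-laden analogue of the idea, not the paper's exact transform.

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)

def to_multires(x):
    coarse = s * (x[0::2] + x[1::2])   # low-resolution averages
    fine = s * (x[0::2] - x[1::2])     # details needed to refine them
    return coarse, fine

def from_multires(coarse, fine):
    x = np.empty(2 * len(coarse))
    x[0::2] = s * (coarse + fine)
    x[1::2] = s * (coarse - fine)
    return x

x = np.array([1.0, 3.0, -2.0, 4.0])
coarse, fine = to_multires(x)
assert np.allclose(from_multires(coarse, fine), x)   # exactly invertible

# Orthonormality: each 2x2 block [[s, s], [s, -s]] has determinant -1,
# so |det J| = 1 for the whole transform and log p is unchanged.
J = np.array([[s, s], [s, -s]])
assert np.isclose(abs(np.linalg.det(J)), 1.0)
```

The coarse signal can then be modeled at low resolution while a conditional flow models only the `fine` residual, which is what keeps the parameter count small at high resolutions.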
