Unpriortized Autoencoder For Image Generation

122 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Hojun Lee

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Jaeyoung Yoo - Hojun Lee - Nojun Kwak

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In this paper, we treat the image generation task using an autoencoder, a representative latent model. Unlike many studies regularizing the latent variables distribution by assuming a manually specified prior, we approach the image generation task using an autoencoder by directly estimating the latent distribution. To this end, we introduce latent density estimator which captures latent distribution explicitly and propose its structure. Through experiments, we show that our generative model generates images with the improved visual quality compared to previous autoencoder-based generative models.

قيم البحث

92 - Mario Lucic , Michael Tschannen , Marvin Ritter 2019

Deep generative models are becoming a cornerstone of modern machine learning. Recent work on conditional generative adversarial networks has shown that learning complex, high-dimensional distributions over natural images is within reach. While the la test models are able to generate high-fidelity, diverse natural images at high resolution, they rely on a vast quantity of labeled data. In this work we demonstrate how one can benefit from recent work on self- and semi-supervised learning to outperform the state of the art on both unsupervised ImageNet synthesis, as well as in the conditional setting. In particular, the proposed approach is able to match the sample quality (as measured by FID) of the current state-of-the-art conditional model BigGAN on ImageNet using only 10% of the labels and outperform it using 20% of the labels.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Re-balancing Variational Autoencoder Loss for Molecule Sequence Generation

94 - Chaochao Yan , Sheng Wang , Jinyu Yang 2019

Molecule generation is to design new molecules with specific chemical properties and further to optimize the desired chemical properties. Following previous work, we encode molecules into continuous vectors in the latent space and then decode the vec tors into molecules under the variational autoencoder (VAE) framework. We investigate the posterior collapse problem of current RNN-based VAEs for molecule sequence generation. For the first time, we find that underestimated reconstruction loss leads to posterior collapse, and provide both theoretical and experimental evidence. We propose an effective and efficient solution to fix the problem and avoid posterior collapse. Without bells and whistles, our method achieves SOTA reconstruction accuracy and competitive validity on the ZINC 250K dataset. When generating 10,000 unique valid SMILES from random prior sampling, it costs JT-VAE1450s while our method only needs 9s. Our implementation is at https://github.com/chaoyan1037/Re-balanced-VAE.

التعلم الآلي الفيزياء الكيميائية التعلم الالي

Implicit Rank-Minimizing Autoencoder

83 - Li Jing , Jure Zbontar , Yann LeCun 2020

An important component of autoencoders is the method by which the information capacity of the latent representation is minimized or limited. In this work, the rank of the covariance matrix of the codes is implicitly minimized by relying on the fact t hat gradient descent learning in multi-layer linear networks leads to minimum-rank solutions. By inserting a number of extra linear layers between the encoder and the decoder, the system spontaneously learns representations with a low effective dimension. The model, dubbed Implicit Rank-Minimizing Autoencoder (IRMAE), is simple, deterministic, and learns compact latent spaces. We demonstrate the validity of the method on several image generation and representation learning tasks.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Encoded Prior Sliced Wasserstein AutoEncoder for learning latent manifold representations

189 - Sanjukta Krishnagopal , Jacob Bedrossian 2020

While variational autoencoders have been successful generative models for a variety of tasks, the use of conventional Gaussian or Gaussian mixture priors are limited in their ability to capture topological or geometric properties of data in the laten t representation. In this work, we introduce an Encoded Prior Sliced Wasserstein AutoEncoder (EPSWAE) wherein an additional prior-encoder network learns an unconstrained prior to match the encoded data manifold. The autoencoder and prior-encoder networks are iteratively trained using the Sliced Wasserstein Distance (SWD), which efficiently measures the distance between two $textit{arbitrary}$ sampleable distributions without being constrained to a specific form as in the KL divergence, and without requiring expensive adversarial training. Additionally, we enhance the conventional SWD by introducing a nonlinear shearing, i.e., averaging over random $textit{nonlinear}$ transformations, to better capture differences between two distributions. The prior is further encouraged to encode the data manifold by use of a structural consistency term that encourages isometry between feature space and latent space. Lastly, interpolation along $textit{geodesics}$ on the latent space representation of the data manifold generates samples that lie on the manifold and hence is advantageous compared with standard Euclidean interpolation. To this end, we introduce a graph-based algorithm for identifying network-geodesics in latent space from samples of the prior that maximize the density of samples along the path while minimizing total energy. We apply our framework to 3D-spiral, MNIST, and CelebA datasets, and show that its latent representations and interpolations are comparable to the state of the art on equivalent architectures.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Recent Advances in Autoencoder-Based Representation Learning

137 - Michael Tschannen , Olivier Bachem , Mario Lucic 2018

Learning useful representations with little or no supervision is a key challenge in artificial intelligence. We provide an in-depth review of recent advances in representation learning with a focus on autoencoder-based models. To organize these resul ts we make use of meta-priors believed useful for downstream tasks, such as disentanglement and hierarchical organization of features. In particular, we uncover three main mechanisms to enforce such properties, namely (i) regularizing the (approximate or aggregate) posterior distribution, (ii) factorizing the encoding and decoding distribution, or (iii) introducing a structured prior distribution. While there are some promising results, implicit or explicit supervision remains a key enabler and all current methods use strong inductive biases and modeling assumptions. Finally, we provide an analysis of autoencoder-based representation learning through the lens of rate-distortion theory and identify a clear tradeoff between the amount of prior knowledge available about the downstream tasks, and how useful the representation is for this task.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي