FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models

239 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ricky T. Q. Chen

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Will Grathwohl - Ricky T. Q. Chen - Jesse Bettencourt

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

A promising class of generative models maps points from a simple distribution to a complex distribution through an invertible neural network. Likelihood-based training of these models requires restricting their architectures to allow cheap computation of Jacobian determinants. Alternatively, the Jacobian trace can be used if the transformation is specified by an ordinary differential equation. In this paper, we use Hutchinsons trace estimator to give a scalable unbiased estimate of the log-density. The result is a continuous-time invertible generative model with unbiased density estimation and one-pass sampling, while allowing unrestricted neural network architectures. We demonstrate our approach on high-dimensional density estimation, image generation, and variational inference, achieving the state-of-the-art among exact likelihood methods with efficient sampling.

قيم البحث

292 - Fei Deng , Zhuo Zhi , Sungjin Ahn 2019

Compositional structures between parts and objects are inherent in natural scenes. Modeling such compositional hierarchies via unsupervised learning can bring various benefits such as interpretability and transferability, which are important in many downstream tasks. In this paper, we propose the first deep latent variable model, called RICH, for learning Representation of Interpretable Compositional Hierarchies. At the core of RICH is a latent scene graph representation that organizes the entities of a scene into a tree structure according to their compositional relationships. During inference, taking top-down approach, RICH is able to use higher-level representation to guide lower-level decomposition. This avoids the difficult problem of routing between parts and objects that is faced by bottom-up approaches. In experiments on images containing multiple objects with different part compositions, we demonstrate that RICH is able to learn the latent compositional hierarchy and generate imaginary scenes.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Scalable Transfer Learning with Expert Models

158 - Joan Puigcerver , Carlos Riquelme , Basil Mustafa 2020

Transfer of pre-trained representations can improve sample efficiency and reduce computational requirements for new tasks. However, representations used for transfer are usually generic, and are not tailored to a particular distribution of downstream tasks. We explore the use of expert representations for transfer with a simple, yet effective, strategy. We train a diverse set of experts by exploiting existing label structures, and use cheap-to-compute performance proxies to select the relevant expert for each target task. This strategy scales the process of transferring to new tasks, since it does not revisit the pre-training data during transfer. Accordingly, it requires little extra compute per target task, and results in a speed-up of 2-3 orders of magnitude compared to competing approaches. Further, we provide an adapter-based architecture able to compress many experts into a single model. We evaluate our approach on two different data sources and demonstrate that it outperforms baselines on over 20 diverse vision tasks in both cases.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Generative Models as Distributions of Functions

433 - Emilien Dupont , Yee Whye Teh , Arnaud Doucet 2021

Generative models are typically trained on grid-like data such as images. As a result, the size of these models usually scales directly with the underlying grid resolution. In this paper, we abandon discretized grids and instead parameterize individu al data points by continuous functions. We then build generative models by learning distributions over such functions. By treating data points as functions, we can abstract away from the specific type of data we train on and construct models that scale independently of signal resolution. To train our model, we use an adversarial approach with a discriminator that acts on continuous signals. Through experiments on both images and 3D shapes, we demonstrate that our model can learn rich distributions of functions independently of data type and resolution.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

A theory of independent mechanisms for extrapolation in generative models

84 - Michel Besserve , Remy Sun , Dominik Janzing 2020

Deep generative models reproduce complex empirical data but cannot extrapolate to novel environments. An intuitive idea to promote extrapolation capabilities is to enforce the architecture to have the modular structure of a causal graphical model, wh ere one can intervene on each module independently of the others in the graph. We develop a framework to formalize this intuition, using the principle of Independent Causal Mechanisms, and show how over-parameterization of generative neural networks can hinder extrapolation capabilities. Our experiments on the generation of human faces shows successive layers of a generator architecture implement independent mechanisms to some extent, allowing meaningful extrapolations. Finally, we illustrate that independence of mechanisms may be enforced during training to improve extrapolation.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Hidden Convexity of Wasserstein GANs: Interpretable Generative Models with Closed-Form Solutions

53 - Arda Sahiner , Tolga Ergen , Batu Ozturkler 2021

Generative Adversarial Networks (GANs) are commonly used for modeling complex distributions of data. Both the generators and discriminators of GANs are often modeled by neural networks, posing a non-transparent optimization problem which is non-conve x and non-concave over the generator and discriminator, respectively. Such networks are often heuristically optimized with gradient descent-ascent (GDA), but it is unclear whether the optimization problem contains any saddle points, or whether heuristic methods can find them in practice. In this work, we analyze the training of Wasserstein GANs with two-layer neural network discriminators through the lens of convex duality, and for a variety of generators expose the conditions under which Wasserstein GANs can be solved exactly with convex optimization approaches, or can be represented as convex-concave games. Using this convex duality interpretation, we further demonstrate the impact of different activation functions of the discriminator. Our observations are verified with numerical results demonstrating the power of the convex interpretation, with applications in progressive training of convex architectures corresponding to linear generators and quadratic-activation discriminators for CelebA image generation. The code for our experiments is available at https://github.com/ardasahiner/ProCoGAN.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط معالجة الصور والفيديو