
Regularized linear autoencoders recover the principal components, eventually

Posted by: Xuchan Bao
Publication date: 2020
Paper language: English





Our understanding of learning input-output relationships with neural nets has improved rapidly in recent years, but little is known about the convergence of the underlying representations, even in the simple case of linear autoencoders (LAEs). We show that when trained with proper regularization, LAEs can directly learn the optimal representation -- ordered, axis-aligned principal components. We analyze two such regularization schemes: non-uniform $\ell_2$ regularization and a deterministic variant of nested dropout [Rippel et al., ICML 2014]. Though both regularization schemes converge to the optimal representation, we show that this convergence is slow due to ill-conditioning that worsens with increasing latent dimension. We show that the inefficiency of learning the optimal representation is not inevitable -- we present a simple modification to the gradient descent update that greatly speeds up convergence empirically.
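
For intuition, here is a minimal numerical sketch of the non-uniform $\ell_2$ scheme (not the authors' code): a linear autoencoder trained by plain gradient descent, where each latent dimension $i$ carries its own penalty $\lambda_i$ and the penalties are distinct. The toy data, penalty values, learning rate, and step count are all assumed for illustration.

```python
# Sketch only (not the authors' code): a linear autoencoder with non-uniform
# L2 regularization, trained by plain gradient descent. The distinct penalties
# lambda_1 < ... < lambda_k are what break the rotational symmetry of the latent space.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 1000, 20, 5                                  # samples, input dim, latent dim (assumed)
X = rng.normal(size=(n, d)) @ np.diag(np.linspace(3.0, 0.5, d))  # anisotropic toy data
X -= X.mean(axis=0)

lam = np.linspace(0.1, 0.5, k)                         # distinct per-latent penalties (assumed values)
W1 = rng.normal(scale=0.1, size=(k, d))                # encoder
W2 = rng.normal(scale=0.1, size=(d, k))                # decoder
lr, steps = 5e-3, 30000                                # assumed optimization settings

for _ in range(steps):
    Z = X @ W1.T                                       # latent codes, shape (n, k)
    R = Z @ W2.T - X                                   # reconstruction residual, shape (n, d)
    gW2 = 2 * (R.T @ Z) / n + 2 * W2 * lam             # column i of the decoder is penalized by lam[i]
    gW1 = 2 * (W2.T @ R.T @ X) / n + 2 * lam[:, None] * W1  # row i of the encoder is penalized by lam[i]
    W1 -= lr * gW1
    W2 -= lr * gW2

# Decoder columns should align (slowly) with the leading principal directions of X, in order.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
cos = np.abs(np.sum((W2 / np.linalg.norm(W2, axis=0)) * Vt[:k].T, axis=0))
print("cosine with top principal directions:", np.round(cos, 3))
```

With distinct penalties the rotational symmetry of the latent space is broken, so the decoder columns drift toward ordered, axis-aligned principal directions; consistent with the abstract, this drift is slow when the penalties are close together.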




Read also

Invertible flow-based generative models are an effective method for learning to generate samples, while allowing for tractable likelihood computation and inference. However, the invertibility requirement restricts models to have the same latent dimensionality as the inputs. This imposes significant architectural, memory, and computational costs, making them more challenging to scale than other classes of generative models such as Variational Autoencoders (VAEs). We propose a generative model based on probability flows that does away with the bijectivity requirement on the model and only assumes injectivity. This also provides another perspective on regularized autoencoders (RAEs), with our final objectives resembling RAEs with specific regularizers that are derived by lower bounding the probability flow objective. We empirically demonstrate the promise of the proposed model, improving over VAEs and AEs in terms of sample quality.
We study principal component analysis (PCA) for mean zero i.i.d. Gaussian observations $X_1,\dots,X_n$ in a separable Hilbert space $\mathbb{H}$ with unknown covariance operator $\Sigma$. The complexity of the problem is characterized by its effective rank ${\bf r}(\Sigma) := \frac{{\rm tr}(\Sigma)}{\|\Sigma\|}$, where ${\rm tr}(\Sigma)$ denotes the trace of $\Sigma$ and $\|\Sigma\|$ denotes its operator norm. We develop a method of bias reduction in the problem of estimation of linear functionals of eigenvectors of $\Sigma$. Under the assumption that ${\bf r}(\Sigma)=o(n)$, we establish the asymptotic normality and asymptotic properties of the risk of the resulting estimators and prove matching minimax lower bounds, showing their semi-parametric optimality.
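
As a quick numeric illustration of the effective rank (values assumed, not taken from the paper), a covariance with a fast-decaying spectrum can have ${\bf r}(\Sigma)$ far below the ambient dimension:

```python
# Numeric illustration only (assumed values): the effective rank
# r(Sigma) = tr(Sigma) / ||Sigma|| can be tiny relative to the dimension.
import numpy as np

eigvals = 2.0 ** -np.arange(50)          # geometrically decaying spectrum (assumed)
Sigma = np.diag(eigvals)                 # finite-dimensional stand-in for the covariance operator
effective_rank = np.trace(Sigma) / np.linalg.norm(Sigma, ord=2)  # trace over operator norm
print(effective_rank)                    # ~2.0, although the ambient dimension is 50
```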
High dimensional data is often assumed to be concentrated on or near a low-dimensional manifold. Autoencoders (AEs) are a popular technique for learning representations of such data by pushing it through a neural network with a low-dimensional bottleneck while minimizing a reconstruction error. Using a high-capacity AE often leads to a large collection of minimizers, many of which represent a low dimensional manifold that fits the data well but generalizes poorly. Two sources of bad generalization are: extrinsic, where the learned manifold possesses extraneous parts that are far from the data; and intrinsic, where the encoder and decoder introduce arbitrary distortion in the low dimensional parameterization. An approach taken to alleviate these issues is to add a regularizer that favors a particular solution; common regularizers promote sparsity, small derivatives, or robustness to noise. In this paper, we advocate an isometry (i.e., local distance preserving) regularizer. Specifically, our regularizer encourages: (i) the decoder to be an isometry; and (ii) the encoder to be the decoder's pseudo-inverse, that is, the encoder extends the inverse of the decoder to the ambient space by orthogonal projection. In a nutshell, (i) and (ii) fix both intrinsic and extrinsic degrees of freedom and provide a non-linear generalization to principal component analysis (PCA). Experimenting with the isometry regularizer on dimensionality reduction tasks produces useful low-dimensional data representations.
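
A minimal sketch of how condition (i) could be penalized in practice, assuming a toy PyTorch decoder and a Frobenius penalty on $J^\top J - I$ at sampled latent codes; this is an illustration of the idea, not the paper's implementation:

```python
# Sketch only (not the paper's implementation): a penalty that pushes the decoder
# Jacobian J at sampled codes toward an isometry, i.e. J^T J = I on the latent space.
import torch

latent_dim, data_dim = 2, 10
decoder = torch.nn.Sequential(                       # toy decoder; the architecture is an assumption
    torch.nn.Linear(latent_dim, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, data_dim),
)

def isometry_penalty(decoder, z_batch):
    """Mean squared deviation of J^T J from the identity over a batch of latent codes."""
    eye = torch.eye(z_batch.shape[1])
    total = 0.0
    for z in z_batch:
        # Jacobian of the decoder at one code, shape (data_dim, latent_dim).
        J = torch.autograd.functional.jacobian(decoder, z, create_graph=True)
        total = total + ((J.T @ J - eye) ** 2).sum()
    return total / len(z_batch)

z = torch.randn(8, latent_dim)
penalty = isometry_penalty(decoder, z)               # weight this and add it to the reconstruction loss
print(penalty.item())
```

The encoder-side condition (ii) would get an analogous penalty, and both terms would be weighted and added to the usual reconstruction loss.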
A standard Variational Autoencoder, with a Euclidean latent space, is structurally incapable of capturing topological properties of certain datasets. To remove topological obstructions, we introduce Diffusion Variational Autoencoders with arbitrary manifolds as a latent space. A Diffusion Variational Autoencoder uses transition kernels of Brownian motion on the manifold. In particular, it uses properties of the Brownian motion to implement the reparametrization trick and fast approximations to the KL divergence. We show that the Diffusion Variational Autoencoder is capable of capturing topological properties of synthetic datasets. Additionally, we train MNIST on spheres, tori, projective spaces, SO(3), and a torus embedded in $\mathbb{R}^3$. Although a natural dataset like MNIST does not have latent variables with a clear-cut topological structure, training it on a manifold can still highlight topological and geometrical properties.
Modern generative models are usually designed to match target distributions directly in the data space, where the intrinsic dimension of data can be much lower than the ambient dimension. We argue that this discrepancy may contribute to the difficulties in training generative models. We therefore propose to map both the generated and target distributions to a latent space using the encoder of a standard autoencoder, and train the generator (or decoder) to match the target distribution in the latent space. Specifically, we enforce the consistency in both the data space and the latent space with theoretically justified data and latent reconstruction losses. The resulting generative model, which we call a perceptual generative autoencoder (PGA), is then trained with a maximum likelihood or variational autoencoder (VAE) objective. With maximum likelihood, PGAs generalize the idea of reversible generative models to unrestricted neural network architectures and arbitrary number of latent dimensions. When combined with VAEs, PGAs substantially improve over the baseline VAEs in terms of sample quality. Compared to other autoencoder-based generative models using simple priors, PGAs achieve state-of-the-art FID scores on CIFAR-10 and CelebA.
