Do you want to publish a course? Click here

Regularized Autoencoders via Relaxed Injective Probability Flow

76   0   0.0 ( 0 )
 Added by Abhishek Kumar
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

Invertible flow-based generative models are an effective method for learning to generate samples, while allowing for tractable likelihood computation and inference. However, the invertibility requirement restricts models to have the same latent dimensionality as the inputs. This imposes significant architectural, memory, and computational costs, making them more challenging to scale than other classes of generative models such as Variational Autoencoders (VAEs). We propose a generative model based on probability flows that does away with the bijectivity requirement on the model and only assumes injectivity. This also provides another perspective on regularized autoencoders (RAEs), with our final objectives resembling RAEs with specific regularizers that are derived by lower bounding the probability flow objective. We empirically demonstrate the promise of the proposed model, improving over VAEs and AEs in terms of sample quality.



rate research

Read More

Our understanding of learning input-output relationships with neural nets has improved rapidly in recent years, but little is known about the convergence of the underlying representations, even in the simple case of linear autoencoders (LAEs). We show that when trained with proper regularization, LAEs can directly learn the optimal representation -- ordered, axis-aligned principal components. We analyze two such regularization schemes: non-uniform $ell_2$ regularization and a deterministic variant of nested dropout [Rippel et al, ICML 2014]. Though both regularization schemes converge to the optimal representation, we show that this convergence is slow due to ill-conditioning that worsens with increasing latent dimension. We show that the inefficiency of learning the optimal representation is not inevitable -- we present a simple modification to the gradient descent update that greatly speeds up convergence empirically.
Injectivity plays an important role in generative models where it enables inference; in inverse problems and compressed sensing with generative priors it is a precursor to well posedness. We establish sharp characterizations of injectivity of fully-connected and convolutional ReLU layers and networks. First, through a layerwise analysis, we show that an expansivity factor of two is necessary and sufficient for injectivity by constructing appropriate weight matrices. We show that global injectivity with iid Gaussian matrices, a commonly used tractable model, requires larger expansivity between 3.4 and 10.5. We also characterize the stability of inverting an injective network via worst-case Lipschitz constants of the inverse. We then use arguments from differential topology to study injectivity of deep networks and prove that any Lipschitz map can be approximated by an injective ReLU network. Finally, using an argument based on random projections, we show that an end-to-end -- rather than layerwise -- doubling of the dimension suffices for injectivity. Our results establish a theoretical basis for the study of nonlinear inverse and inference problems using neural networks.
Domain adaptation aims to exploit the knowledge in source domain to promote the learning tasks in target domain, which plays a critical role in real-world applications. Recently, lots of deep learning approaches based on autoencoders have achieved a significance performance in domain adaptation. However, most existing methods focus on minimizing the distribution divergence by putting the source and target data together to learn global feature representations, while they do not consider the local relationship between instances in the same category from different domains. To address this problem, we propose a novel Semi-Supervised Representation Learning framework via Dual Autoencoders for domain adaptation, named SSRLDA. More specifically, we extract richer feature representations by learning the global and local feature representations simultaneously using two novel autoencoders, which are referred to as marginalized denoising autoencoder with adaptation distribution (MDAad) and multi-class marginalized denoising autoencoder (MMDA) respectively. Meanwhile, we make full use of label information to optimize feature representations. Experimental results show that our proposed approach outperforms several state-of-the-art baseline methods.
High dimensional data is often assumed to be concentrated on or near a low-dimensional manifold. Autoencoders (AE) is a popular technique to learn representations of such data by pushing it through a neural network with a low dimension bottleneck while minimizing a reconstruction error. Using high capacity AE often leads to a large collection of minimizers, many of which represent a low dimensional manifold that fits the data well but generalizes poorly. Two sources of bad generalization are: extrinsic, where the learned manifold possesses extraneous parts that are far from the data; and intrinsic, where the encoder and decoder introduce arbitrary distortion in the low dimensional parameterization. An approach taken to alleviate these issues is to add a regularizer that favors a particular solution; common regularizers promote sparsity, small derivatives, or robustness to noise. In this paper, we advocate an isometry (i.e., local distance preserving) regularizer. Specifically, our regularizer encourages: (i) the decoder to be an isometry; and (ii) the encoder to be the decoders pseudo-inverse, that is, the encoder extends the inverse of the decoder to the ambient space by orthogonal projection. In a nutshell, (i) and (ii) fix both intrinsic and extrinsic degrees of freedom and provide a non-linear generalization to principal component analysis (PCA). Experimenting with the isometry regularizer on dimensionality reduction tasks produces useful low-dimensional data representations.
69 - Yan Min , Mao Ye , Liang Tian 2020
Feature selection is a widely used dimension reduction technique to select feature subsets because of its interpretability. Many methods have been proposed and achieved good results, in which the relationships between adjacent data points are mainly concerned. But the possible associations between data pairs that are may not adjacent are always neglected. Different from previous methods, we propose a novel and very simple approach for unsupervised feature selection, named MMFS (Multi-step Markov transition probability for Feature Selection). The idea is using multi-step Markov transition probability to describe the relation between any data pair. Two ways from the positive and negative viewpoints are employed respectively to keep the data structure after feature selection. From the positive viewpoint, the maximum transition probability that can be reached in a certain number of steps is used to describe the relation between two points. Then, the features which can keep the compact data structure are selected. From the viewpoint of negative, the minimum transition probability that can be reached in a certain number of steps is used to describe the relation between two points. On the contrary, the features that least maintain the loose data structure are selected. And the two ways can also be combined. Thus three algorithms are proposed. Our main contributions are a novel feature section approach which uses multi-step transition probability to characterize the data structure, and three algorithms proposed from the positive and negative aspects for keeping data structure. The performance of our approach is compared with the state-of-the-art methods on eight real-world data sets, and the experimental results show that the proposed MMFS is effective in unsupervised feature selection.

suggested questions

comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا