The morphology of a radio galaxy is strongly affected by its central active galactic nucleus (AGN), which is studied to reveal the evolution of the supermassive black hole (SMBH). In this work, we propose a morphology generation framework for two typical classes of radio galaxies, namely Fanaroff-Riley type-I (FRI) and type-II (FRII), based on a deep neural network autoencoder (DNNAE) and Gaussian mixture models (GMMs). The encoder and decoder subnets of the DNNAE are symmetric about a fully connected layer, the code layer, which holds the extracted feature vectors. By randomly generating feature vectors with a three-component Gaussian mixture model, new FRI or FRII radio galaxy morphologies are simulated. Experiments were conducted on real radio galaxy images, in which we discuss the length of the feature vectors and the selection of loss functions, and compare batch normalization and dropout techniques for training the network. The results suggest that our morphology generation framework is efficient and performs well. Code is available at: https://github.com/myinxd/dnnae-gmm.
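The generation step described above (drawing new feature vectors from a three-component Gaussian mixture) can be sketched as follows. This is only an illustration under stated assumptions: the mixture weights, means, and standard deviations are made-up values rather than parameters fitted to encoder outputs, the covariance is assumed diagonal, the two-dimensional code length is arbitrary, and `sample_gmm` is a hypothetical helper name. In the framework itself, the sampled vectors would then be passed through the trained decoder, which is not reproduced here.

```python
import random


def sample_gmm(weights, means, stds, n, rng):
    """Draw n vectors from a diagonal-covariance Gaussian mixture."""
    samples = []
    for _ in range(n):
        # Pick a component index according to the mixture weights.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Draw each coordinate independently from the chosen component.
        samples.append([rng.gauss(m, s) for m, s in zip(means[k], stds[k])])
    return samples


# Illustrative (not fitted) parameters for a three-component mixture
# over a two-dimensional code layer.
rng = random.Random(0)
weights = [0.5, 0.3, 0.2]
means = [[0.0, 0.0], [3.0, 3.0], [-3.0, 2.0]]
stds = [[1.0, 1.0], [0.5, 0.5], [0.8, 0.3]]

codes = sample_gmm(weights, means, stds, 5, rng)
for code in codes:
    print(code)
```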
The Variational Autoencoder (VAE) has become a powerful tool for modeling the non-linear generative process of data from a low-dimensional latent space. Recently, several studies have proposed using VAEs for unsupervised clustering, with mixture models capturing the multi-modal structure of the latent representations. This strategy, however, is ineffective when there are outlier samples whose latent representations are meaningless yet contaminate the estimation of the major clusters in the latent space. This exact problem arises in resting-state fMRI (rs-fMRI) analysis, where clustering major functional connectivity patterns is often hindered by the heavy noise of rs-fMRI and by many minor clusters (rare connectivity patterns) that are of no interest to the analysis. In this paper we propose a novel generative process in which a Gaussian mixture models a few major clusters in the data and a non-informative uniform distribution captures the remaining data. We embed this truncated Gaussian mixture model in a Variational Autoencoder framework to obtain a general approach to joint clustering and outlier detection, called tGM-VAE. We demonstrate the applicability of tGM-VAE on the MNIST dataset and further validate it in the context of rs-fMRI connectivity analysis.
Variational autoencoders (VAEs) have been shown to be able to generate game levels but require manual exploration of the learned latent space to generate outputs with desired attributes. While conditional VAEs address this by allowing generation to be conditioned on labels, such labels have to be provided during training and thus require prior knowledge which may not always be available. In this paper, we apply Gaussian Mixture VAEs (GMVAEs), a variant of the VAE which imposes a mixture of Gaussians (GM) on the latent space, unlike regular VAEs which impose a unimodal Gaussian. This allows GMVAEs to cluster levels in an unsupervised manner using the components of the GM and then generate new levels using the learned components. We demonstrate our approach with levels from Super Mario Bros., Kid Icarus and Mega Man. Our results show that the learned components discover and cluster level structures and patterns and can be used to generate levels with desired characteristics.
Gaussian mixture models (GMMs) are powerful parametric tools with many applications in machine learning and computer vision. Expectation maximization (EM) is the most popular algorithm for estimating GMM parameters. However, EM guarantees only convergence to a stationary point of the log-likelihood function, which could be arbitrarily worse than the optimal solution. Inspired by the relationship between the negative log-likelihood function and the Kullback-Leibler (KL) divergence, we propose an alternative formulation for estimating GMM parameters using the sliced Wasserstein distance, which gives rise to a new algorithm. Specifically, we propose minimizing the sliced Wasserstein distance between the mixture model and the data distribution with respect to the GMM parameters. In contrast to the KL divergence, the energy landscape of the sliced Wasserstein distance is better behaved and therefore more suitable for a stochastic gradient descent scheme to obtain the optimal GMM parameters. We show that our formulation yields parameter estimates that are more robust to random initializations and demonstrate that it can estimate high-dimensional data distributions more faithfully than the EM algorithm.
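For readers unfamiliar with the quantity being minimized, here is a minimal sketch of a Monte Carlo estimate of the squared sliced 2-Wasserstein distance between two equal-sized samples. It is not the algorithm proposed above: it only evaluates the distance and does not optimize GMM parameters. The function name `sliced_wasserstein2`, the number of projections, and the toy data are assumptions made for illustration.

```python
import math
import random


def sliced_wasserstein2(xs, ys, n_proj, rng):
    """Estimate the squared sliced 2-Wasserstein distance between two
    equal-sized point sets by averaging over random 1D projections."""
    assert len(xs) == len(ys), "this sketch assumes equal sample sizes"
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_proj):
        # Draw a random direction on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]
        # Project both samples onto the direction and sort them; in 1D the
        # optimal transport plan matches sorted values in order.
        px = sorted(sum(a * b for a, b in zip(x, d)) for x in xs)
        py = sorted(sum(a * b for a, b in zip(y, d)) for y in ys)
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    return total / n_proj


rng = random.Random(0)
xs = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
ys = [[x[0] + 1.0, x[1]] for x in xs]  # xs shifted by one unit along axis 0

same = sliced_wasserstein2(xs, xs, 50, rng)
shifted = sliced_wasserstein2(xs, ys, 50, rng)
print(same, shifted)
```

Because each projection reduces the problem to sorting, the estimate is cheap to compute, which is part of what makes the distance attractive as an objective for stochastic gradient schemes.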
In this paper we address the problem of building a class of robust factorization algorithms that solve for the shape and motion parameters with both affine (weak perspective) and perspective camera models. We introduce a Gaussian/uniform mixture model and its associated EM algorithm. This allows us to address robust parameter estimation within a data clustering approach. We propose a robust technique that works with any affine factorization method and makes it robust to outliers. In addition, we show how such a framework can be further embedded into an iterative perspective factorization scheme. We carry out a large number of experiments to validate our algorithms and to compare them with existing ones. We also compare our approach with factorization methods that use M-estimators.
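The Gaussian/uniform mixture idea can be illustrated with a one-dimensional toy EM, in which inliers follow a Gaussian and outliers follow a uniform distribution over a known range. This is a sketch under stated assumptions, not the factorization algorithm of the abstract: the data are synthetic, the outlier range `[lo, hi]` is assumed known, and `em_gauss_uniform` is a hypothetical name.

```python
import math
import random


def em_gauss_uniform(data, lo, hi, n_iter=100):
    """EM for a 1D mixture of one Gaussian (inliers) and one uniform
    component on [lo, hi] (outliers)."""
    u = 1.0 / (hi - lo)  # constant density of the uniform component
    mu = sum(data) / len(data)
    var = sum((x - mu) ** 2 for x in data) / len(data)
    pi = 0.5  # initial prior probability of being an inlier
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each point is an inlier.
        resp = []
        for x in data:
            g = pi * math.exp(-((x - mu) ** 2) / (2.0 * var))
            g /= math.sqrt(2.0 * math.pi * var)
            resp.append(g / (g + (1.0 - pi) * u))
        # M-step: responsibility-weighted updates of the parameters.
        total = sum(resp)
        mu = sum(r * x for r, x in zip(resp, data)) / total
        var = sum(r * (x - mu) ** 2 for r, x in zip(resp, data)) / total
        var = max(var, 1e-6)  # guard against a collapsing variance
        pi = total / len(data)
    return mu, var, pi, resp


# Synthetic data: 200 inliers around 2.0 and 50 uniform outliers.
rng = random.Random(0)
data = [rng.gauss(2.0, 0.5) for _ in range(200)]
data += [rng.uniform(-10.0, 10.0) for _ in range(50)]

mu, var, pi, resp = em_gauss_uniform(data, -10.0, 10.0)
print(mu, var, pi)
```

Points with a low final responsibility in `resp` are the ones the model attributes to the uniform component, so they contribute little to the estimates of `mu` and `var`.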
Heterogeneity of sentences exists in sequence-to-sequence tasks such as machine translation. Sentences with widely varying meanings or grammatical structures may make convergence more difficult when training the network. In this paper, we introduce a model to resolve this heterogeneity in sequence-to-sequence tasks. The Multi-filter Gaussian Mixture Autoencoder (MGMAE) uses an autoencoder to learn representations of the inputs. The representations are the outputs of the encoder and lie in a latent space whose dimension is the hidden dimension of the encoder. The latent representations of the training data are used to fit a Gaussian mixture, which divides them among several Gaussian components. A filter (decoder) is tuned specifically to fit the data in one of the Gaussian components. Each Gaussian corresponds to one filter, so that the filter is responsible for the heterogeneity within that Gaussian. In this way the heterogeneity of the training data can be resolved. Comparative experiments are conducted on the Geo-query dataset and on English-French translation. Our experiments show that, compared to the traditional encoder-decoder model, this network achieves better performance on sequence-to-sequence tasks such as machine translation and question answering.
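The routing idea (one filter per Gaussian component) might be sketched as below. This is only a rough illustration under assumptions of our own: the mixture parameters are made up rather than fitted, the covariance is assumed diagonal, each latent vector is assigned to the single most probable component, and the "decoders" are placeholder functions standing in for trained networks. The names `log_gauss_diag` and `route` are hypothetical.

```python
import math


def log_gauss_diag(z, mean, std):
    """Log-density of a diagonal-covariance Gaussian evaluated at z."""
    return sum(
        -0.5 * ((v - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for v, m, s in zip(z, mean, std)
    )


def route(z, weights, means, stds):
    """Return the index of the component with the highest posterior for z."""
    scores = [
        math.log(w) + log_gauss_diag(z, m, s)
        for w, m, s in zip(weights, means, stds)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Illustrative (not fitted) two-component mixture over a 2D latent space.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
stds = [[1.0, 1.0], [1.0, 1.0]]

# Placeholder decoders: one per Gaussian component.
decoders = [
    lambda z: ("decoder_0", z),
    lambda z: ("decoder_1", z),
]

latents = [[0.2, -0.1], [3.8, 4.1]]
routed = [route(z, weights, means, stds) for z in latents]
outputs = [decoders[k](z) for k, z in zip(routed, latents)]
print(routed)
```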