Estimation of Non-Normalized Mixture Models and Clustering Using Deep Representation


Abstract

We develop a general method for estimating a finite mixture of non-normalized models. Here, a non-normalized model is defined to be a parametric distribution with an intractable normalization constant. Existing methods for estimating non-normalized models without computing the normalization constant are not applicable to mixture models, because a mixture contains more than one intractable normalization constant. The proposed method is derived by extending noise contrastive estimation (NCE), which estimates non-normalized models by discriminating between the observed data and artificially generated noise. We also propose an extension of NCE with multiple noise distributions. Then, based on the observation that conventional classification learning with neural networks implicitly assumes an exponential family as the generative model, we introduce a method for clustering unlabeled data by estimating a finite mixture of distributions in an exponential family. Estimation of this mixture model is attained with the proposed extensions of NCE, in which the training data of the neural network are used as noise. The proposed method thus provides a probabilistically principled clustering method that can exploit a deep representation. An application to image clustering using a deep neural network gives promising results.
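To make the core idea of NCE concrete, the following is a minimal sketch of plain (single-noise, single-component) NCE, not the mixture extension developed in the thesis. It assumes PyTorch, a toy one-dimensional unnormalized Gaussian model, and a noise ratio of one; all names (log_model, c, and so on) are illustrative.

```python
import torch

torch.manual_seed(0)

# Parameters of an unnormalized 1-D Gaussian model f(x) = -0.5*((x - mu)/sigma)^2,
# plus a free scalar c that estimates the log of the unknown normalization term.
mu = torch.tensor(0.0, requires_grad=True)
log_sigma = torch.tensor(0.0, requires_grad=True)
c = torch.tensor(0.0, requires_grad=True)

def log_model(x):
    # Unnormalized log-density plus the learned log-normalizer c.
    return -0.5 * ((x - mu) / log_sigma.exp()) ** 2 + c

noise = torch.distributions.Normal(0.0, 1.0)  # noise with a tractable density p_n

# Synthetic "observed" data from N(2, 0.5^2); in the thesis this would be real
# data, and log_model a richer non-normalized model.
data = 2.0 + 0.5 * torch.randn(1000)

opt = torch.optim.Adam([mu, log_sigma, c], lr=0.05)
for _ in range(500):
    x_n = noise.sample((1000,))  # one noise sample per data point (ratio nu = 1)
    # NCE reduces to logistic regression with label 1 for data and 0 for noise;
    # the Bayes-optimal logit is log p_model(x) - log p_noise(x).
    logit_d = log_model(data) - noise.log_prob(data)
    logit_n = log_model(x_n) - noise.log_prob(x_n)
    loss = -(torch.nn.functional.logsigmoid(logit_d).mean()
             + torch.nn.functional.logsigmoid(-logit_n).mean())
    opt.zero_grad()
    loss.backward()
    opt.step()

# mu and sigma should approach 2.0 and 0.5, while c should approach
# -log(sigma * sqrt(2 * pi)), i.e. the normalizer is recovered as a byproduct.
print(mu.item(), log_sigma.exp().item(), c.item())
```

Because the log-normalizer c is treated as just another trainable parameter of the classifier, the intractable constant never has to be computed; a mixture model, as the abstract notes, carries one such constant per component, which is what the proposed extensions of NCE are designed to handle.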
