Dirichlet Process Parsimonious Mixtures for clustering


Abstract in English

The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian prospective. We propose new Dirichlet Process Parsimonious mixtures (DPPM) which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. The proposed DPPM models are Bayesian nonparametric parsimonious mixture models that allow to simultaneously infer the model parameters, the optimal number of mixture components and the optimal parsimonious mixture structure from the data. We develop a Gibbs sampling technique for maximum a posteriori (MAP) estimation of the developed DPMM models and provide a Bayesian model selection framework by using Bayes factors. We apply them to cluster simulated data and real data sets, and compare them to the standard parsimonious mixture models. The obtained results highlight the effectiveness of the proposed nonparametric parsimonious mixture models as a good nonparametric alternative for the parametric parsimonious models.

Download