A survey on Bayesian inference for Gaussian mixture model


Abstract in English

Clustering has become a core technology in machine learning, largely due to its application in the field of unsupervised learning, clustering, classification, and density estimation. A frequentist approach exists to hand clustering based on mixture model which is known as the EM algorithm where the parameters of the mixture model are usually estimated into a maximum likelihood estimation framework. Bayesian approach for finite and infinite Gaussian mixture model generates point estimates for all variables as well as associated uncertainty in the form of the whole estimates posterior distribution. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in Bayesian inference for finite and infinite Gaussian mixture model in order to seamlessly introduce their applications in subsequent sections. However, we clearly realize our inability to cover all the useful and interesting results concerning this field and given the paucity of scope to present this discussion, e.g., the separated analysis of the generation of Dirichlet samples by stick-breaking and Polyas Urn approaches. We refer the reader to literature in the field of the Dirichlet process mixture model for a much detailed introduction to the related fields. Some excellent examples include (Frigyik et al., 2010; Murphy, 2012; Gelman et al., 2014; Hoff, 2009). This survey is primarily a summary of purpose, significance of important background and techniques for Gaussian mixture model, e.g., Dirichlet prior, Chinese restaurant process, and most importantly the origin and complexity of the methods which shed light on their modern applications. The mathematical prerequisite is a first course in probability. Other than this modest background, the development is self-contained, with rigorous proofs provided throughout.

Download