Approximating the marginal likelihood in mixture models

 Added by Christian Robert
 Publication date 2008
Language: English




In Chib (1995), a method for approximating marginal densities in a Bayesian setting is proposed, with one prominent application being the estimation of the number of components in a normal mixture. As pointed out in Neal (1999) and Frühwirth-Schnatter (2004), the approximation often falls short of providing a proper approximation to the true marginal densities because of the well-known label switching problem (Celeux et al., 2000). While there exist other alternatives to the derivation of approximate marginal densities, we reconsider the original proposal here and show, as in Berkhof et al. (2003) and Lee et al. (2008), that it truly approximates the marginal densities once the label switching issue has been solved.
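
For readers who want the idea behind the abstract in symbols, the following is a brief sketch rather than the paper's own derivation. The notation is assumed here for illustration: $x$ is the sample, $k$ the number of components, $\theta^*$ an arbitrary fixed parameter value, $z^{(1)},\dots,z^{(T)}$ the component allocations simulated by a Gibbs sampler, and $\mathfrak{S}_k$ the set of permutations of the $k$ component labels. The first line is the identity underlying Chib's method, the second is the usual Rao-Blackwellised estimate of the posterior ordinate, and the third is a version averaged over all label permutations, in the spirit of the correction attributed above to Berkhof et al. (2003).

```latex
% Marginal likelihood identity: holds for every value of \theta^*
m_k(x) = \frac{f_k(x \mid \theta^*)\,\pi_k(\theta^*)}{\pi_k(\theta^* \mid x)}

% Rao-Blackwellised estimate of the posterior ordinate from Gibbs output
\hat{\pi}_k(\theta^* \mid x) = \frac{1}{T}\sum_{t=1}^{T}
    \pi_k\bigl(\theta^* \mid x, z^{(t)}\bigr)

% Estimate symmetrised over the k! relabellings of the components
\tilde{\pi}_k(\theta^* \mid x) = \frac{1}{T\,k!}
    \sum_{\sigma \in \mathfrak{S}_k}\sum_{t=1}^{T}
    \pi_k\bigl(\sigma(\theta^*) \mid x, z^{(t)}\bigr)
```

Under an exchangeable prior the posterior is invariant under relabelling, so it has $k!$ symmetric modes. If the sampler never leaves one of them and the modes are well separated, the unsymmetrised estimate $\hat{\pi}_k$ is off by a factor of roughly $k!$, that is, about $\log k!$ on the log marginal scale, which is the kind of discrepancy the symmetrised estimate $\tilde{\pi}_k$ is meant to remove.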



Related research

119 - Umberto Picchini 2016
A maximum likelihood methodology for the parameters of models with an intractable likelihood is introduced. We produce a likelihood-free version of the stochastic approximation expectation-maximization (SAEM) algorithm to maximize the likelihood function of model parameters. While SAEM is best suited for models having a tractable complete likelihood function, its application to moderately complex models is a difficult or even impossible task. We show how to construct a likelihood-free version of SAEM by using the synthetic likelihood paradigm. Our method is completely plug-and-play, requires almost no tuning and can be applied to both static and dynamic models. Four simulation studies illustrate the method, including a stochastic differential equation model, a stochastic Lotka-Volterra model and data from $g$-and-$k$ distributions. MATLAB code is available as supplementary material.
96 - Gilles Celeux 2018
Determining the number G of components in a finite mixture distribution is an important and difficult inference issue. This is a most important question, because statistical inference about the resulting model is highly sensitive to the value of G. Selecting an erroneous value of G may produce a poor density estimate. This is also a most difficult question from a theoretical perspective as it relates to unidentifiability issues of the mixture model. This is further a most relevant question from a practical viewpoint since the meaning of the number of components G is strongly related to the modelling purpose of a mixture distribution. We distinguish in this chapter between selecting G as a density estimation problem in Section 2 and selecting G in a model-based clustering framework in Section 3. Both sections discuss frequentist as well as Bayesian approaches. We present here some of the Bayesian solutions to the different interpretations of picking the right number of components in a mixture, before concluding on the ill-posed nature of the question.
131 - Edouard Ollier 2021
Nonlinear mixed effects models are hidden variable models that are widely used in many fields such as pharmacometrics. In such models, the distribution characteristics of hidden variables can be specified by including several parameters such as covariates or correlations which must be selected. Recent development of pharmacogenomics has brought averaged/high dimensional problems to the field of nonlinear mixed effects modeling for which standard covariate selection techniques like stepwise methods are not well suited. This work proposes to select covariates and correlation parameters using a penalized likelihood approach. The penalized likelihood problem is solved using a stochastic proximal gradient algorithm to avoid inner-outer iterations. Speed of convergence of the proximal gradient algorithm is improved by the use of component-wise adaptive gradient step sizes. The practical implementation and tuning of the proximal gradient algorithm are explored using simulations. Calibration of regularization parameters is performed by minimizing the Bayesian Information Criterion using particle swarm optimization, a zero-order optimization procedure. The use of warm restart and parallelization significantly reduces computing time. The performance of the proposed method compared to the traditional grid search strategy is explored using simulated data. Finally, an application to real data from two pharmacokinetics studies is provided, one studying an antifibrinolytic and the other studying an antibiotic.
Gaussian latent tree models, or more generally, Gaussian latent forest models have Fisher-information matrices that become singular along interesting submodels, namely, models that correspond to subforests. For these singularities, we compute the real log-canonical thresholds (also known as stochastic complexities or learning coefficients) that quantify the large-sample behavior of the marginal likelihood in Bayesian inference. This provides the information needed for a recently introduced generalization of the Bayesian information criterion. Our mathematical developments treat the general setting of Laplace integrals whose phase functions are sums of squared differences between monomials and constants. We clarify how in this case real log-canonical thresholds can be computed using polyhedral geometry, and we show how to apply the general theory to the Laplace integrals associated with Gaussian latent tree and forest models. In simulations and a data example, we demonstrate how the mathematical knowledge can be applied in model selection.
126 - Long Feng, Lee H. Dicker 2016
Nonparametric maximum likelihood (NPML) for mixture models is a technique for estimating mixing distributions that has a long and rich history in statistics going back to the 1950s, and is closely related to empirical Bayes methods. Historically, NPML-based methods have been considered to be relatively impractical because of computational and theoretical obstacles. However, recent work focusing on approximate NPML methods suggests that these methods may have great promise for a variety of modern applications. Building on this recent work, a class of flexible, scalable, and easy-to-implement approximate NPML methods is studied for problems with multivariate mixing distributions. Concrete guidance on implementing these methods is provided, with theoretical and empirical support; topics covered include identifying the support set of the mixing distribution, and comparing algorithms (across a variety of metrics) for solving the simple convex optimization problem at the core of the approximate NPML problem. Additionally, three diverse real data applications are studied to illustrate the methods' performance: (i) a baseball data analysis (a classical example for empirical Bayes methods), (ii) high-dimensional microarray classification, and (iii) online prediction of blood-glucose density for diabetes patients. Among other things, the empirical results demonstrate the relative effectiveness of using multivariate (as opposed to univariate) mixing distributions for NPML-based approaches.