We establish verifiable conditions under which Metropolis-Hastings (MH) algorithms with a position-dependent proposal covariance matrix will or will not have a geometric rate of convergence. Some diffusion-based MH algorithms, such as the Metropolis adjusted Langevin algorithm (MALA) and pre-conditioned MALA (PCMALA), have a position-independent proposal variance, whereas for other variants of MALA, such as manifold MALA (MMALA), the proposal covariance matrix changes in every iteration. Thus, we provide conditions for geometric ergodicity of different variations of Langevin algorithms. These conditions are verified in the context of conditional simulation from the two most popular generalized linear mixed models (GLMMs), namely the binomial GLMM with the logit link and the Poisson GLMM with the log link. Empirical comparison in the framework of some spatial GLMMs shows that the computationally less expensive PCMALA with an appropriately chosen pre-conditioning matrix may outperform MMALA.
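The distinction between position-independent and position-dependent proposal covariances can be made concrete with a generic Langevin-type MH step. The sketch below is illustrative only: the names mh_langevin_step and cov_fn are ours, and the MMALA curvature/drift correction terms are omitted. With a constant cov_fn it behaves like a MALA/PCMALA-style proposal; a covariance that varies with the current state gives the MMALA-style behaviour discussed above.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mh_langevin_step(x, log_pi, grad_log_pi, cov_fn, h, rng):
    """One MH step with a Langevin-type Gaussian proposal.

    cov_fn(x) returns the proposal covariance scale at x: a constant matrix
    gives MALA/PCMALA-style proposals, while a position-dependent matrix
    gives an MMALA-style proposal (simplified; drift correction terms omitted).
    """
    C_x = cov_fn(x)
    mean_x = x + 0.5 * h * C_x @ grad_log_pi(x)   # Langevin drift at x
    y = rng.multivariate_normal(mean_x, h * C_x)  # propose

    C_y = cov_fn(y)
    mean_y = y + 0.5 * h * C_y @ grad_log_pi(y)   # drift for the reverse move

    # log MH acceptance ratio: target ratio times proposal density ratio
    log_alpha = (log_pi(y) - log_pi(x)
                 + multivariate_normal.logpdf(x, mean_y, h * C_y)
                 - multivariate_normal.logpdf(y, mean_x, h * C_x))
    if np.log(rng.uniform()) < log_alpha:
        return y, True
    return x, False

# Example: standard normal target; constant identity covariance = plain MALA.
rng = np.random.default_rng(0)
log_pi = lambda x: -0.5 * x @ x
grad_log_pi = lambda x: -x
x = np.zeros(2)
for _ in range(1000):
    x, _ = mh_langevin_step(x, log_pi, grad_log_pi,
                            cov_fn=lambda x: np.eye(2), h=0.5, rng=rng)
```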
Relevance vector machine (RVM) is a popular sparse Bayesian learning model typically used for prediction. Recently it has been shown that improper priors assumed on the multiple penalty parameters in RVM may lead to an improper posterior. Currently in the literature, the sufficient conditions for posterior propriety of RVM do not allow improper priors over the multiple penalty parameters. In this article, we propose a single penalty relevance vector machine (SPRVM) model in which the multiple penalty parameters are replaced by a single penalty, and we consider a semi-Bayesian approach for fitting the SPRVM. The necessary and sufficient conditions for posterior propriety of SPRVM are more liberal than those of RVM and allow for several improper priors over the penalty parameter. Additionally, we prove the geometric ergodicity of the Gibbs sampler used to analyze the SPRVM model, and hence the asymptotic standard errors associated with the Monte Carlo estimates of the means of the posterior predictive distribution can be computed. Such Monte Carlo standard errors cannot be computed in the case of RVM, since the rate of convergence of the Gibbs sampler used to analyze RVM is not known. The predictive performance of RVM and SPRVM is compared by analyzing three real life datasets.
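Schematically, the move from RVM to SPRVM amounts to replacing the per-coefficient penalty (precision) parameters of the classical RVM prior by one common penalty. The display below is only this schematic contrast; the hyperpriors and any error-variance scaling used in the semi-Bayesian SPRVM analysis are as specified in the article, not here.

```latex
% Classical RVM prior: one penalty (precision) parameter per coefficient
\beta_j \mid \alpha_j \;\sim\; \mathrm{N}\!\left(0, \alpha_j^{-1}\right), \qquad j = 1, \dots, p,
% SPRVM: a single penalty parameter \lambda shared by all coefficients
\boldsymbol{\beta} \mid \lambda \;\sim\; \mathrm{N}_p\!\left(\mathbf{0}, \lambda^{-1} I_p\right).
```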
115 - Yalin Rao, Vivekananda Roy 2021
The logistic linear mixed model (LLMM) is one of the most widely used statistical models. Generally, Markov chain Monte Carlo algorithms are used to explore the posterior densities associated with Bayesian LLMMs. Polson, Scott and Windle's (2013) Polya-Gamma data augmentation (DA) technique can be used to construct full Gibbs (FG) samplers for LLMMs. Here, we develop efficient block Gibbs (BG) samplers for Bayesian LLMMs using the Polya-Gamma DA method. We compare the FG and BG samplers in the context of simulated and real data examples as the correlation between the fixed and random effects changes, as well as when the dimensions of the design matrices vary. These numerical examples demonstrate superior performance of the BG samplers over the FG samplers. We also derive conditions guaranteeing geometric ergodicity of the BG Markov chain when the popular improper uniform prior is assigned to the regression coefficients and proper or improper priors are placed on the variance parameters of the random effects. This theoretical result has important practical implications, as it justifies the use of asymptotically valid Monte Carlo standard errors for Markov chain based estimates of posterior quantities.
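To make the blocking concrete, here is a minimal sketch of one BG sweep for a toy logistic mixed model, assuming a flat prior on the fixed effects, a N(0, tau2 I) prior on the random effects and an inverse-gamma prior on tau2; the function names, these prior choices and the crude truncated-series Polya-Gamma draw are ours, and the paper's exact updates may differ. An FG sweep would instead update beta and u from their individual conditionals rather than jointly.

```python
import numpy as np

def pg_draw_approx(c, rng, n_terms=200):
    """Approximate draw from PG(1, c) via a truncated version of its
    infinite-series representation
    PG(1, c) = (1 / (2*pi^2)) * sum_k g_k / ((k - 1/2)^2 + c^2 / (4*pi^2)),
    g_k ~ Exp(1).  A crude truncation, for illustration only; in practice an
    exact Polya-Gamma sampler would be used."""
    k = np.arange(1, n_terms + 1)
    g = rng.exponential(1.0, size=n_terms)
    return np.sum(g / ((k - 0.5) ** 2 + c ** 2 / (4 * np.pi ** 2))) / (2 * np.pi ** 2)

def block_gibbs_iter(beta, u, tau2, y, X, Z, a, b, rng):
    """One block Gibbs (BG) sweep for a toy logistic mixed model
    y_i ~ Bernoulli(logit^{-1}(x_i'beta + z_i'u)), flat prior on beta,
    u | tau2 ~ N(0, tau2 I), tau2 ~ IG(a, b).  (beta, u) is drawn jointly."""
    n, p = X.shape
    q = Z.shape[1]
    eta = X @ beta + Z @ u
    omega = np.array([pg_draw_approx(c, rng) for c in eta])   # latent PG draws

    W = np.hstack([X, Z])                                     # joint design matrix
    prior_prec = np.zeros((p + q, p + q))
    prior_prec[p:, p:] = np.eye(q) / tau2                     # only u is penalized
    V = np.linalg.inv(W.T @ (omega[:, None] * W) + prior_prec)
    m = V @ (W.T @ (y - 0.5))                                 # kappa = y - 1/2
    theta = rng.multivariate_normal(m, V)                     # joint (beta, u) draw
    beta, u = theta[:p], theta[p:]

    tau2 = 1.0 / rng.gamma(a + q / 2, 1.0 / (b + u @ u / 2))  # inverse-gamma update
    return beta, u, tau2

# Tiny synthetic example
rng = np.random.default_rng(1)
X, Z = rng.normal(size=(50, 2)), rng.normal(size=(50, 3))
y = rng.binomial(1, 0.5, size=50).astype(float)
beta, u, tau2 = np.zeros(2), np.zeros(3), 1.0
for _ in range(100):
    beta, u, tau2 = block_gibbs_iter(beta, u, tau2, y, X, Z, a=2.0, b=2.0, rng=rng)
```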
Sparse Bayesian learning models are typically used for prediction in datasets with a significantly greater number of covariates than observations. Such models often take a reproducing kernel Hilbert space (RKHS) approach to carry out the task of prediction and can be implemented using either proper or improper priors. In this article we show that a few sparse Bayesian learning models in the literature, when implemented using improper priors, lead to improper posteriors.
We develop a Bayesian variable selection method, called SVEN, based on a hierarchical Gaussian linear model with priors placed on the regression coefficients as well as on the model space. Sparsity is achieved by using degenerate spike priors on inactive variables, whereas Gaussian slab priors are placed on the coefficients of the important predictors, making the posterior probability of a model available in explicit form (up to a normalizing constant). Strong model selection consistency is shown to be attained when the number of predictors grows nearly exponentially with the sample size, even when the norm of the mean effects solely due to the unimportant variables diverges, which is a novel attractive feature. An appealing byproduct of SVEN is the construction of novel model weight adjusted prediction intervals. Embedding a unique model based screening step and using fast Cholesky updates, SVEN produces a highly scalable computational framework to explore gigantic model spaces, rapidly identify regions of high posterior probability, and make fast inference and prediction. A temperature schedule guided by our model selection consistency derivations is used to further mitigate multimodal posterior distributions. The performance of SVEN is demonstrated through a number of simulation experiments and a real data example from a genome-wide association study with over half a million markers.
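To illustrate why the (unnormalized) posterior probability of a model is available in closed form under a degenerate-spike/Gaussian-slab prior, the sketch below marginalizes the active coefficients in a toy conjugate setup with a known error variance. The hyperparameters tau2 and sigma2, the independent Bernoulli model prior and the function names are our assumptions; SVEN's actual hierarchy, screening and Cholesky updates are not reproduced here.

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_unnorm_model_posterior(gamma, y, X, tau2, sigma2, log_prior_gamma):
    """Log posterior probability (up to a normalizing constant) of a model
    gamma (boolean mask over the columns of X) in a toy spike-and-slab setup:
    inactive coefficients are exactly zero (degenerate spike), active ones get
    a N(0, sigma2 * tau2) slab, and y | beta ~ N(X beta, sigma2 I).
    Integrating out the active coefficients gives a Gaussian marginal for y."""
    n = len(y)
    X_g = X[:, gamma]
    cov = sigma2 * (np.eye(n) + tau2 * X_g @ X_g.T)   # marginal covariance of y
    return log_prior_gamma(gamma) + multivariate_normal.logpdf(y, np.zeros(n), cov)

# Example: compare two candidate models under an (assumed) independent
# Bernoulli(0.1) prior on the inclusion indicators.
rng = np.random.default_rng(2)
n, p = 40, 10
X = rng.normal(size=(n, p))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=n)
log_prior = lambda g: g.sum() * np.log(0.1) + (len(g) - g.sum()) * np.log(0.9)
m1 = np.zeros(p, dtype=bool); m1[[0, 1]] = True      # the data-generating model
m2 = np.zeros(p, dtype=bool); m2[[2, 3]] = True      # a wrong model
print(log_unnorm_model_posterior(m1, y, X, 1.0, 1.0, log_prior) >
      log_unnorm_model_posterior(m2, y, X, 1.0, 1.0, log_prior))
```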
106 - Vivekananda Roy 2019
Markov chain Monte Carlo (MCMC) is one of the most useful approaches to scientific computing because of its flexible construction, ease of use and generality. Indeed, MCMC is indispensable for performing Bayesian analysis. Two critical questions that MCMC practitioners need to address are where to start and when to stop the simulation. Although a great amount of research has gone into establishing convergence criteria and stopping rules with sound theoretical foundation, in practice, MCMC users often decide convergence by applying empirical diagnostic tools. This review article discusses the most widely used MCMC convergence diagnostic tools. Some recently proposed stopping rules with firm theoretical footing are also presented. The convergence diagnostics and stopping rules are illustrated using three detailed examples.
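For instance, one of the most widely used diagnostics covered in such reviews, the Gelman-Rubin potential scale reduction factor, can be computed from parallel chains as in the minimal sketch below; the function name is ours, and refinements such as split chains and rank normalization are omitted.

```python
import numpy as np

def gelman_rubin(chains):
    """Classical Gelman-Rubin potential scale reduction factor for a scalar
    quantity, computed from an (m, n) array of m chains of length n.
    Values close to 1 suggest the chains are mixing over the same distribution."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()          # within-chain variance
    B = n * chain_means.var(ddof=1)                # between-chain variance
    var_hat = (n - 1) / n * W + B / n              # pooled variance estimate
    return np.sqrt(var_hat / W)

# Example: two well-mixed chains targeting the same normal distribution.
rng = np.random.default_rng(3)
chains = rng.normal(size=(2, 5000))
print(gelman_rubin(chains))   # should be close to 1
```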
The standard importance sampling (IS) estimator generally does not work well in examples involving simultaneous inference on several targets, as the importance weights can take arbitrarily large values, making the estimator highly unstable. In such situations, alternative generalized IS estimators involving samples from multiple proposal distributions are preferred. Just like standard IS, the success of these multiple IS estimators crucially depends on the choice of the proposal distributions. The selection of these proposal distributions is the focus of this article. We propose three methods based on (i) a geometric space-filling coverage criterion, (ii) a minimax variance approach, and (iii) a maximum entropy approach. The first two methods are applicable to any multi-proposal IS estimator, whereas the third approach is described in the context of Doss's (2010) two-stage IS estimator. For the first method we propose a suitable measure of coverage based on the symmetric Kullback-Leibler divergence, while the second and third approaches use estimates of the asymptotic variances of Doss's (2010) IS estimator and Geyer's (1994) reverse logistic estimator, respectively. Thus, we provide consistent spectral variance estimators for these asymptotic variances. The proposed methods for selecting proposal densities are illustrated using various detailed examples.
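A minimal sketch of a generic multi-proposal IS estimator is given below to fix ideas. It uses simple self-normalized mixture weighting with fully known (normalized) proposal densities, so it is only a stand-in for, not an implementation of, the Doss (2010) and Geyer (1994) estimators discussed above; all names in it are ours.

```python
import numpy as np
from scipy.stats import norm

def multi_proposal_is(h, log_target_unnorm, proposals, samples):
    """Self-normalized multiple-proposal IS estimate of E_pi[h]: pool the
    samples from several proposals and weight each point by the unnormalized
    target over a sample-size-weighted mixture of the proposal densities.
    Assumes the proposal densities are fully known; estimators such as Doss's
    two-stage method instead estimate normalizing-constant ratios first."""
    sizes = np.array([len(s) for s in samples], dtype=float)
    mix_wts = sizes / sizes.sum()
    x = np.concatenate(samples)
    # log of the proposal mixture evaluated at every pooled point
    log_mix = np.logaddexp.reduce(
        np.array([np.log(w) + q.logpdf(x) for w, q in zip(mix_wts, proposals)]),
        axis=0)
    log_w = log_target_unnorm(x) - log_mix
    w = np.exp(log_w - log_w.max())            # stabilize before normalizing
    return np.sum(w * h(x)) / np.sum(w)

# Example: target N(0, 1) known only up to a constant,
# two normal proposals centred at -1 and +1.
rng = np.random.default_rng(4)
proposals = [norm(-1.0, 1.5), norm(1.0, 1.5)]
samples = [q.rvs(size=5000, random_state=rng) for q in proposals]
est = multi_proposal_is(lambda x: x ** 2,           # estimate E[X^2] = 1
                        lambda x: -0.5 * x ** 2,    # unnormalized log target
                        proposals, samples)
print(est)
```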
Spatial generalized linear mixed models (SGLMMs) are popular for analyzing non-Gaussian spatial data. These models assume a prescribed link function that relates the underlying spatial field to the mean response. There are circumstances, such as when the data contain outlying observations, where the use of a prescribed link function can result in poor fit, which can be improved by using a parametric link function. Some popular link functions, such as the Box-Cox, are unsuitable because they are inconsistent with the Gaussian assumption on the spatial field. We present sensible choices of parametric link functions that possess desirable properties. It is important to estimate the parameters of the link function, rather than assume a known value. To that end, we present a generalized importance sampling (GIS) estimator based on multiple Markov chains for empirical Bayes analysis of SGLMMs. The GIS estimator, although more efficient than simple importance sampling, can be highly variable when used to estimate the parameters of certain link functions. Using suitable reparameterizations of the Monte Carlo samples, we propose modified GIS estimators that do not suffer from this high variability. We use a Laplace approximation for choosing the multiple importance densities in the GIS estimator. Finally, we develop a methodology for selecting models with an appropriate link function family, which extends to choosing a spatial correlation function as well. We present an ensemble prediction of the mean response by appropriately weighting the estimates from different models. The proposed methodology is illustrated using simulated and real data examples.
The logistic regression model is the most popular model for analyzing binary data. In the absence of any prior information, an improper flat prior is often used for the regression coefficients in Bayesian logistic regression models. The resulting intractable posterior density can be explored by running Polson et al.'s (2013) data augmentation (DA) algorithm. In this paper, we establish that the Markov chain underlying Polson et al.'s (2013) DA algorithm is geometrically ergodic. Proving this theoretical result is practically important, as it ensures the existence of central limit theorems (CLTs) for sample averages under a finite second moment condition. The CLT in turn allows users of the DA algorithm to calculate standard errors for posterior estimates.
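Concretely, one iteration of this DA algorithm alternates between two conditional draws, the standard Polson-Scott-Windle conditionals specialized to the improper flat prior on the regression coefficients considered here; the display below is that specialization, written with our notation.

```latex
% One DA iteration under the flat prior on \beta, with
% \kappa = (y_1 - 1/2, \dots, y_n - 1/2)^{\top} and \Omega = \mathrm{diag}(\omega_1, \dots, \omega_n):
\omega_i \mid \beta, y \;\sim\; \mathrm{PG}\!\left(1, \, x_i^{\top}\beta\right), \qquad i = 1, \dots, n,
\qquad
\beta \mid \omega, y \;\sim\; \mathrm{N}\!\left( (X^{\top}\Omega X)^{-1} X^{\top}\kappa, \; (X^{\top}\Omega X)^{-1} \right).
```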
Rank data arises frequently in marketing, finance, organizational behavior, and psychology. Most analysis of rank data reported in the literature assumes the presence of one or more variables (sometimes latent) based on whose values the items are ranked. In this paper we analyze rank data using a purely probabilistic model where the observed ranks are assumed to be perturbed