
Ultimate Polya Gamma Samplers -- Efficient MCMC for possibly imbalanced binary and categorical data

Added by Gregor Zens
Publication date: 2020
Language: English





Modeling binary and categorical data is one of the most commonly encountered tasks for applied statisticians and econometricians. While Bayesian methods in this context have been available for decades, they often require a high level of familiarity with Bayesian statistics or suffer from issues such as low sampling efficiency. To make Bayesian models for binary and categorical data more accessible, we introduce novel latent variable representations based on Polya Gamma random variables for a range of commonly encountered discrete choice models. From these latent variable representations, new Gibbs sampling algorithms for binary, binomial and multinomial logistic regression models are derived. All models allow for a conditionally Gaussian likelihood representation, rendering extensions to more complex modeling frameworks such as state space models straightforward. However, sampling efficiency may still be an issue in these data-augmentation-based estimation frameworks. To counteract this, MCMC boosting strategies are developed and discussed in detail. The merits of our approach are illustrated through extensive simulations and a real data application.
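To make the augmentation scheme concrete, below is a minimal sketch of the basic Polya Gamma Gibbs sampler for Bayesian logistic regression in the Polson-Scott-Windle style; the boosting steps developed in the paper for imbalanced data are omitted. It assumes the third-party `polyagamma` Python package for the PG draws, and the prior hyperparameters are illustrative defaults.

```python
# Minimal sketch of a Polya Gamma Gibbs sampler for Bayesian logistic
# regression (the paper's MCMC boosting steps are NOT included).
# Assumes the third-party `polyagamma` package for PG(1, z) draws.
import numpy as np
from polyagamma import random_polyagamma

def pg_gibbs_logit(X, y, n_iter=2000, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    b0, B0inv = np.zeros(p), np.eye(p) / 100.0  # illustrative N(0, 100 I) prior
    beta = np.zeros(p)
    kappa = y - 0.5                              # kappa_i = y_i - 1/2
    draws = np.empty((n_iter, p))
    for it in range(n_iter):
        # 1) latent draw: omega_i ~ PG(1, x_i' beta)
        omega = random_polyagamma(1, X @ beta, random_state=rng)
        # 2) beta | omega, y is Gaussian: the conditionally Gaussian
        #    likelihood representation mentioned in the abstract
        V = np.linalg.inv(X.T @ (omega[:, None] * X) + B0inv)
        m = V @ (X.T @ kappa + B0inv @ b0)
        beta = rng.multivariate_normal(m, V)
        draws[it] = beta
    return draws
```

The conditionally Gaussian step 2 is what makes extensions to state space models straightforward: once the latent omegas are drawn, the regression block can be updated with any standard Gaussian machinery.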



Related Research

We introduce the UPG package for highly efficient Bayesian inference in probit, logit, multinomial logit and binomial logit models. UPG offers a convenient estimation framework for balanced and imbalanced data settings where sampling efficiency is ensured through Markov chain Monte Carlo boosting methods. All sampling algorithms are implemented in C++, allowing for rapid parameter estimation. In addition, UPG provides several methods for fast production of output tables and summary plots that are easily accessible to a broad range of users.
Monte Carlo methods are the standard procedure for estimating complicated integrals of multidimensional Bayesian posterior distributions. In this work, we focus on LAIS, a class of adaptive importance samplers in which Markov chain Monte Carlo (MCMC) algorithms are employed to drive an underlying multiple importance sampling (IS) scheme. Its power lies in the simplicity of the layered framework: the upper layer locates proposal densities by means of MCMC algorithms, while the lower layer handles the multiple IS scheme in order to compute the final estimators. The modular nature of LAIS allows for different possible choices in the upper and lower layers, which differ in performance and computational cost. In this work, we propose several enhancements that increase the efficiency and reduce the computational cost of both the upper and lower layers. These variants are essential for addressing computational challenges that arise in real-world applications, such as highly concentrated posterior distributions (due to large amounts of data, etc.). Hamiltonian-driven importance samplers are presented and tested. Furthermore, we introduce different strategies for designing cheaper schemes, for instance, recycling samples generated in the upper layer and using them in the final estimators in the lower layer. Numerical experiments show the benefits of the proposed schemes compared to the vanilla version of LAIS and other benchmark methods.
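A minimal sketch of the two-layer idea follows, under simplifying assumptions: a single random-walk Metropolis chain in the upper layer, identical Gaussian proposals and deterministic-mixture weighting in the lower layer. The Hamiltonian-driven and sample-recycling variants proposed in the paper are omitted, and `logpost` is a user-supplied (unnormalized) log-posterior.

```python
# Minimal sketch of a vanilla LAIS scheme: an MCMC upper layer locates
# proposal means, a multiple-IS lower layer computes the weighted sample.
import numpy as np
from scipy.stats import multivariate_normal as mvn

def lais(logpost, x0, n_upper=500, m_lower=5, prop_cov=None, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    C = np.eye(x.size) if prop_cov is None else prop_cov
    # --- upper layer: random-walk Metropolis locates proposal means ---
    mus, lp = [], logpost(x)
    for _ in range(n_upper):
        cand = rng.multivariate_normal(x, C)
        lp_cand = logpost(cand)
        if np.log(rng.uniform()) < lp_cand - lp:
            x, lp = cand, lp_cand
        mus.append(x.copy())
    mus = np.asarray(mus)
    # --- lower layer: multiple IS with deterministic-mixture weights ---
    samples = np.vstack([rng.multivariate_normal(mu, C, size=m_lower)
                         for mu in mus])
    log_num = np.array([logpost(s) for s in samples])
    # denominator: equally weighted mixture of all proposal components
    log_den = np.logaddexp.reduce(
        np.stack([mvn.logpdf(samples, mean=mu, cov=C) for mu in mus]), axis=0
    ) - np.log(len(mus))
    logw = log_num - log_den
    w = np.exp(logw - logw.max())
    return samples, w / w.sum()   # weighted sample targeting the posterior
```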
Bayesian inference for Gibbs random fields (GRFs) is often referred to as a doubly intractable problem, since the likelihood function is intractable. The exploration of the posterior distribution of such models is typically carried out with a sophisticated Markov chain Monte Carlo (MCMC) method, the exchange algorithm (Murray et al., 2006), which requires simulations from the likelihood function at each iteration. The purpose of this paper is to consider an approach that dramatically reduces this computational overhead. To this end, we introduce a novel class of algorithms which use realizations of the GRF model, simulated offline, at locations specified by a grid that spans the parameter space. This strategy dramatically speeds up posterior inference, as illustrated on several examples. However, using the pre-computed simulations introduces noise into the MCMC algorithm, which is then no longer exact. We study the theoretical behaviour of the resulting approximate MCMC algorithm and derive convergence bounds using a recent theoretical development on approximate MCMC methods.
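A minimal sketch of the pre-computation idea for an exponential-family GRF, f(y|theta) proportional to exp(theta' s(y)): the online auxiliary simulation of the exchange algorithm is replaced by a lookup of the sufficient statistic of an offline realization at the nearest grid point. All names and the nearest-neighbour lookup are illustrative assumptions.

```python
# Minimal sketch of an approximate exchange algorithm using offline
# GRF simulations on a parameter grid (illustrative, not the paper's code).
import numpy as np

def approx_exchange(log_prior, s_y, grid_theta, grid_stats,
                    theta0, n_iter=5000, step=0.1, seed=0):
    """grid_theta: (G, d) grid spanning the parameter space;
    grid_stats: (G, d) sufficient statistics s(y_g) of GRF realizations
    simulated offline at each grid point; s_y: s(observed data)."""
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    chain = np.empty((n_iter, theta.size))
    for it in range(n_iter):
        cand = theta + step * rng.standard_normal(theta.shape)
        # lookup: offline realization at the grid point nearest to cand
        g = np.argmin(np.linalg.norm(grid_theta - cand, axis=1))
        s_aux = grid_stats[g]
        # exchange-algorithm ratio for f(y|theta) ~ exp(theta' s(y));
        # the intractable normalizing constants cancel
        log_alpha = (log_prior(cand) - log_prior(theta)
                     + (cand - theta) @ (s_y - s_aux))
        if np.log(rng.uniform()) < log_alpha:
            theta = cand
        chain[it] = theta
    return chain
```

The grid lookup is what makes the algorithm approximate: the auxiliary draw comes from a nearby grid point rather than exactly from f(.|cand), which is the source of the noise studied in the paper's convergence bounds.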
Tore Selland Kleppe (2015)
The usage of positive definite metric tensors derived from second derivative information in the context of the simplified manifold Metropolis adjusted Langevin algorithm (MALA) is explored. A new adaptive step length procedure is proposed that resolves the shortcomings of such metric tensors in regions where the log-target has near-zero curvature in some direction. The adaptive step length selection also appears to alleviate the need for the different tuning parameters in transient and stationary regimes that is typical of MALA. The combination of metric tensors derived from second derivative information and adaptive step length selection constitutes a large step towards reliable manifold MCMC methods that can be implemented automatically for models with unknown or intractable Fisher information, and even for target distributions that do not admit a factorization into prior and likelihood. Through examples of low to moderate dimension, it is shown that the proposed methodology performs very well relative to alternative MCMC methods.
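A minimal sketch of one simplified manifold MALA step with a metric built from the negative Hessian, where eigenvalues are clamped away from zero as a crude stand-in for the paper's treatment of near-zero curvature; the adaptive step length procedure itself is omitted and a fixed `eps` is assumed.

```python
# Minimal sketch of a simplified manifold MALA step (illustrative only;
# fixed step size instead of the paper's adaptive selection).
import numpy as np

def psd_metric(hess_x, floor=1e-3):
    """Positive definite metric from second derivative information:
    eigen-decompose the negative Hessian and clamp the eigenvalues."""
    w, V = np.linalg.eigh(-hess_x)
    w = np.maximum(np.abs(w), floor)       # repair zero/negative curvature
    return (V * w) @ V.T, (V / w) @ V.T    # G and G^{-1}

def _logq(z, mean, G, eps):
    """Log density (up to constants) of N(mean, eps^2 G^{-1}) at z."""
    diff = z - mean
    return 0.5 * np.linalg.slogdet(G)[1] - 0.5 * (diff @ G @ diff) / eps**2

def smmala_step(x, logpi, grad, hess, eps, rng):
    G, Ginv = psd_metric(hess(x))
    mean_x = x + 0.5 * eps**2 * Ginv @ grad(x)
    prop = mean_x + eps * np.linalg.cholesky(Ginv) @ rng.standard_normal(x.size)
    # Metropolis correction with position-dependent Gaussian proposals
    Gp, Ginvp = psd_metric(hess(prop))
    mean_p = prop + 0.5 * eps**2 * Ginvp @ grad(prop)
    log_alpha = (logpi(prop) - logpi(x)
                 + _logq(x, mean_p, Gp, eps) - _logq(prop, mean_x, G, eps))
    return (prop, True) if np.log(rng.uniform()) < log_alpha else (x, False)
```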
Latent position network models are a versatile tool in network science; applications include clustering entities, controlling for causal confounders, and defining priors over unobserved graphs. Estimating each node's latent position is typically framed as a Bayesian inference problem, with Metropolis within Gibbs being the most popular tool for approximating the posterior distribution. However, it is well known that Metropolis within Gibbs is inefficient for large networks: the acceptance ratios are expensive to compute, and the resulting posterior draws are highly correlated. In this article, we propose an alternative Markov chain Monte Carlo strategy, defined using a combination of split Hamiltonian Monte Carlo and Firefly Monte Carlo, that leverages the functional form of the posterior distribution for more efficient posterior computation. We demonstrate that these strategies outperform Metropolis within Gibbs and other algorithms on synthetic networks, as well as on real information-sharing networks of teachers and staff in a school district.
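For context, a minimal sketch of the kind of model these strategies target, with a plain HMC update of all positions jointly; the split HMC and Firefly refinements are what the article adds on top. The distance-based likelihood, the `alpha` intercept, and the user-supplied `grad` callable (the gradient of `log_post` in `Z`) are all illustrative assumptions.

```python
# Minimal sketch: latent position model log posterior and one vanilla HMC
# update (NOT the article's split HMC / Firefly scheme).
import numpy as np

def log_post(Z, A, alpha=1.0, tau=1.0):
    """Log posterior of positions Z (n, d) given adjacency A (n, n), with
    P(A_ij = 1) = logistic(alpha - ||z_i - z_j||) and a Gaussian prior."""
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    eta = alpha - D                                # logit of edge probability
    iu = np.triu_indices(len(Z), k=1)              # count each dyad once
    ll = np.sum(A[iu] * eta[iu] - np.logaddexp(0.0, eta[iu]))
    return ll - 0.5 * np.sum(Z**2) / tau**2

def hmc_step(Z, A, grad, eps=0.01, n_leap=20, seed=None):
    rng = np.random.default_rng(seed)
    p0 = rng.standard_normal(Z.shape)              # refresh momenta
    Znew, p = Z.copy(), p0 + 0.5 * eps * grad(Z)   # first half step
    for i in range(n_leap):                        # leapfrog integration
        Znew = Znew + eps * p
        step = eps if i < n_leap - 1 else 0.5 * eps   # half step at the end
        p = p + step * grad(Znew)
    log_alpha = (log_post(Znew, A) - log_post(Z, A)
                 + 0.5 * (np.sum(p0**2) - np.sum(p**2)))
    return Znew if np.log(rng.uniform()) < log_alpha else Z
```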