
Bayesian Inference for State Space Models using Block and Correlated Pseudo Marginal Methods

Added by Minh-Ngoc Tran
Publication date: 2016
Language: English





This article addresses the problem of efficient Bayesian inference in dynamic systems using particle methods and makes a number of contributions. First, we develop a correlated pseudo-marginal (CPM) approach for Bayesian inference in state space (SS) models that is based on filtering the disturbances rather than the states. This approach is useful when the state transition density is intractable or inefficient to compute, and also when the dimension of the disturbance is lower than the dimension of the state. Second, we propose a block pseudo-marginal (BPM) method that takes as the estimate of the likelihood the average of G independent unbiased estimates of the likelihood. We associate with each of these individual unbiased likelihood estimates the set of underlying uniform or standard normal random numbers used to construct it, and then use component-wise Markov chain Monte Carlo to update the parameter vector jointly with one set of these random numbers at a time. This induces a correlation of approximately 1-1/G between the logs of the estimated likelihoods at the proposed and current values of the model parameters. Third, we show for some non-stationary state space models that the BPM approach is much more efficient than the CPM approach, because it is difficult for the CPM approach to translate the high correlation in the underlying random numbers into high correlation between the logs of the likelihood estimates. Although our focus is on applying the BPM method to state space models, our results and approach can be used in a wide range of applications of the pseudo-marginal (PM) method, such as panel data models, subsampling problems and approximate Bayesian computation.
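
The following is a minimal sketch of the BPM update described in the abstract, assuming a toy unbiased likelihood estimator (a simple importance-sampling average standing in for the disturbance particle filter); the function lik_hat, the block count G, the block size N, and the proposal scale are all illustrative choices, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def lik_hat(theta, u):
    """Toy unbiased importance-sampling estimate of p(y | theta) under
    y = theta + x + e with x, e ~ N(0, 1): average the Gaussian density of
    the residual over the block of standard normals u (a stand-in for a
    disturbance particle filter)."""
    y = 1.2                                      # single observed data point
    dens = np.exp(-0.5 * (y - theta - u) ** 2) / np.sqrt(2.0 * np.pi)
    return dens.mean()

def log_prior(theta):
    return -0.5 * theta ** 2                     # N(0, 1) prior on theta

G, N, n_iter, step = 20, 50, 2000, 0.3           # blocks, normals per block, iterations, RW scale
theta = 0.0
U = rng.standard_normal((G, N))                  # one block of normals per unbiased estimate
log_lik = np.log(np.mean([lik_hat(theta, U[g]) for g in range(G)]))

for it in range(n_iter):
    g = it % G                                   # refresh only one block this sweep
    theta_prop = theta + step * rng.standard_normal()
    U_prop = U.copy()
    U_prop[g] = rng.standard_normal(N)
    # the BPM estimate at the proposal reuses the other G-1 blocks unchanged
    log_lik_prop = np.log(np.mean([lik_hat(theta_prop, U_prop[h]) for h in range(G)]))
    log_alpha = (log_lik_prop + log_prior(theta_prop)) - (log_lik + log_prior(theta))
    if np.log(rng.uniform()) < log_alpha:
        theta, U, log_lik = theta_prop, U_prop, log_lik_prop
```

Because G-1 of the G blocks of random numbers are shared between the current and proposed likelihood estimates, the logs of the two estimates are correlated at roughly 1-1/G, which is what keeps the acceptance rate from collapsing when the individual estimates are noisy.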



Related research

Stochastic differential equation mixed-effects models (SDEMEMs) are flexible hierarchical models that are able to account for random variability inherent in the underlying time-dynamics, as well as the variability between experimental units and, optionally, account for measurement error. Fully Bayesian inference for state-space SDEMEMs is performed, using data at discrete times that may be incomplete and subject to measurement error. However, the inference problem is complicated by the typical intractability of the observed data likelihood which motivates the use of sampling-based approaches such as Markov chain Monte Carlo. A Gibbs sampler is proposed to target the marginal posterior of all parameter values of interest. The algorithm is made computationally efficient through careful use of blocking strategies and correlated pseudo-marginal Metropolis-Hastings steps within the Gibbs scheme. The resulting methodology is flexible and is able to deal with a large class of SDEMEMs. The methodology is demonstrated on three case studies, including tumor growth dynamics and neuronal data. The gains in terms of increased computational efficiency are model and data dependent, but unless bespoke sampling strategies requiring analytical derivations are possible for a given model, we generally observe an efficiency increase of one order of magnitude when using correlated particle methods together with our blocked-Gibbs strategy.
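
The correlated pseudo-marginal Metropolis-Hastings steps mentioned above can be sketched as follows, again with a toy unbiased estimator standing in for the particle filter; the autoregressive (Crank-Nicolson) move on the auxiliary normals is the standard correlated pseudo-marginal proposal, and lik_hat, rho = 0.99 and the proposal scale are illustrative assumptions rather than the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)

def lik_hat(theta, u):
    """Toy unbiased estimate of a Gaussian marginal likelihood, driven by the
    standard normals in u (a placeholder for a disturbance particle filter)."""
    y = 0.7
    return np.mean(np.exp(-0.5 * (y - theta - u) ** 2) / np.sqrt(2.0 * np.pi))

def cpm_step(theta, u, log_lik, rho=0.99, step=0.3):
    """One correlated pseudo-marginal MH step: the auxiliary normals move by a
    Crank-Nicolson proposal rho*u + sqrt(1-rho^2)*eps, so the current and
    proposed likelihood estimates stay highly correlated."""
    theta_prop = theta + step * rng.standard_normal()
    u_prop = rho * u + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(u.shape)
    log_lik_prop = np.log(lik_hat(theta_prop, u_prop))
    # N(0, 1) prior on theta; the u-proposal is reversible w.r.t. N(0, I),
    # so its density cancels from the acceptance ratio
    log_alpha = (log_lik_prop - 0.5 * theta_prop ** 2) - (log_lik - 0.5 * theta ** 2)
    if np.log(rng.uniform()) < log_alpha:
        return theta_prop, u_prop, log_lik_prop
    return theta, u, log_lik

theta, u = 0.0, rng.standard_normal(100)
log_lik = np.log(lik_hat(theta, u))
for _ in range(1000):
    theta, u, log_lik = cpm_step(theta, u, log_lik)
```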
State-space models provide an important body of techniques for analyzing time series, but their use requires estimating unobserved states. The optimal estimate of the state is its conditional expectation given the observation histories, and computing this expectation is hard when there are nonlinearities. Existing filtering methods, including sequential Monte Carlo, tend to be either inaccurate or slow. In this paper, we study a nonlinear filter for nonlinear/non-Gaussian state-space models, which uses Laplace's method, an asymptotic series expansion, to approximate the state's conditional mean and variance, together with a Gaussian conditional distribution. This Laplace-Gaussian filter (LGF) gives fast, recursive, deterministic state estimates, with an error which is set by the stochastic characteristics of the model and is, we show, stable over time. We illustrate the estimation ability of the LGF by applying it to the problem of neural decoding and compare it to sequential Monte Carlo both in simulations and with real data. We find that the LGF can deliver superior results in a small fraction of the computing time.
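
A rough sketch of one filtering step in this spirit is given below, using only a first-order Laplace approximation (the LGF's asymptotic series expansion adds higher-order correction terms); the scalar random-walk state model and the Poisson spike-count observation model are illustrative assumptions, chosen to echo the neural decoding setting.

```python
import numpy as np

def lgf_step(m_prev, P_prev, y, q=0.1, newton_iters=20):
    """One Laplace-approximation filtering step for x_t = x_{t-1} + N(0, q)
    and y_t | x_t ~ Poisson(exp(x_t)): Newton search for the posterior mode,
    then a Gaussian approximation with variance -1/Hessian at the mode."""
    m_pred, P_pred = m_prev, P_prev + q          # linear-Gaussian prediction
    x = m_pred
    for _ in range(newton_iters):
        grad = y - np.exp(x) - (x - m_pred) / P_pred
        hess = -np.exp(x) - 1.0 / P_pred         # always negative, so Newton is stable
        x -= grad / hess
    return x, -1.0 / hess

m, P = 0.0, 1.0                                  # prior mean and variance for x_0
for y in [0, 2, 1, 3, 5]:                        # a short sequence of spike counts
    m, P = lgf_step(m, P, y)
    print(f"filtered mean {m:.3f}, variance {P:.3f}")
```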
We use the theory of normal variance-mean mixtures to derive a data augmentation scheme for models that include gamma functions. Our methodology applies to many situations in statistics and machine learning, including Multinomial-Dirichlet distributions, negative binomial regression, Poisson-Gamma hierarchical models, and extreme value models, to name but a few. All of these models include a gamma function that does not admit a natural conjugate prior distribution, which poses a significant challenge to inference and prediction. To provide a data augmentation strategy, we construct and develop the theory of the class of Exponential Reciprocal Gamma distributions. This allows scalable EM and MCMC algorithms to be developed. We illustrate our methodology on a number of examples, including gamma shape inference, negative binomial regression and Dirichlet allocation. Finally, we conclude with directions for future research.
A general Bayesian framework is introduced for mixture modelling and inference with real-valued time series. At the top level, the state space is partitioned via the choice of a discrete context tree, so that the resulting partition depends on the values of some of the most recent samples. At the bottom level, a different model is associated with each region of the partition. This defines a very rich and flexible class of mixture models, for which we provide algorithms that allow for efficient, exact Bayesian inference. In particular, we show that the maximum a posteriori probability (MAP) model (including the relevant MAP context tree partition) can be precisely identified, along with its exact posterior probability. The utility of this general framework is illustrated in detail when a different autoregressive (AR) model is used in each state-space region, resulting in a mixture-of-AR model class. The performance of the associated algorithmic tools is demonstrated in the problems of model selection and forecasting on both simulated and real-world data, where they are found to provide results as good as, or better than, state-of-the-art methods.
We consider a pseudo-marginal Metropolis-Hastings kernel $P_m$ that is constructed using an average of $m$ exchangeable random variables, as well as an analogous kernel $P_s$ that averages $s<m$ of these same random variables. Using an embedding technique to facilitate comparisons, we show that the asymptotic variances of ergodic averages associated with $P_m$ are lower bounded in terms of those associated with $P_s$. We show that the bound provided is tight and disprove a conjecture that when the random variables to be averaged are independent, the asymptotic variance under $P_m$ is never less than $s/m$ times the variance under $P_s$. The conjecture does, however, hold when considering continuous-time Markov chains. These results imply that if the computational cost of the algorithm is proportional to $m$, it is often better to set $m=1$. We provide intuition as to why these findings differ so markedly from recent results for pseudo-marginal kernels employing particle filter approximations. Our results are exemplified through two simulation studies; in the first the computational cost is effectively proportional to $m$ and in the second there is a considerable start-up cost at each iteration.
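
A minimal sketch of the kernel $P_m$ is given below, assuming a toy model in which each unbiased likelihood estimate is the true likelihood times independent mean-one log-normal noise (an illustrative stand-in for the paper's exchangeable estimators); it only illustrates the construction and the fact that every proposal consumes $m$ estimates, not the paper's variance bounds.

```python
import numpy as np

rng = np.random.default_rng(2)

def lik_estimate(theta, m, sigma=1.5):
    """Average of m exchangeable unbiased estimates of the N(0, 1) density at
    theta: the true likelihood times independent mean-one log-normal noise."""
    noise = np.exp(sigma * rng.standard_normal(m) - 0.5 * sigma ** 2)
    return np.exp(-0.5 * theta ** 2) * noise.mean()

def pm_chain(m, n_iter=20000, step=1.0):
    """Pseudo-marginal random-walk Metropolis using the kernel that averages
    m fresh estimates at every proposal (the analogue of P_m above)."""
    theta, lhat = 0.0, lik_estimate(0.0, m)
    accepted = 0
    for _ in range(n_iter):
        theta_prop = theta + step * rng.standard_normal()
        lhat_prop = lik_estimate(theta_prop, m)  # m estimates consumed here
        if rng.uniform() * lhat < lhat_prop:
            theta, lhat, accepted = theta_prop, lhat_prop, accepted + 1
    return accepted / n_iter

# Cost per iteration is proportional to m, so any comparison across m should
# weigh mixing quality against the m estimates spent on every proposal.
for m in (1, 4, 16):
    print(f"m={m:2d}  acceptance rate={pm_chain(m):.3f}")
```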
