No Arabic abstract
This paper proposes a family of weighted batch means variance estimators, which are computationally efficient and can be conveniently applied in practice. The focus is on Markov chain Monte Carlo simulations and estimation of the asymptotic covariance matrix in the Markov chain central limit theorem, where conditions ensuring strong consistency are provided. Finite sample performance is evaluated through auto-regressive, Bayesian spatial-temporal, and Bayesian logistic regression examples, where the new estimators show significant computational gains with a minor sacrifice in variance compared with existing methods.
Calculating a Monte Carlo standard error (MCSE) is an important step in the statistical analysis of the simulation output obtained from a Markov chain Monte Carlo experiment. An MCSE is usually based on an estimate of the variance of the asymptotic normal distribution. We consider spectral and batch means methods for estimating this variance. In particular, we establish conditions which guarantee that these estimators are strongly consistent as the simulation effort increases. In addition, for the batch means and overlapping batch means methods we establish conditions ensuring consistency in the mean-square sense which in turn allows us to calculate the optimal batch size up to a constant of proportionality. Finally, we examine the empirical finite-sample properties of spectral variance and batch means estimators and provide recommendations for practitioners.
Markov chain Monte Carlo (MCMC) produces a correlated sample for estimating expectations with respect to a target distribution. A fundamental question is when should sampling stop so that we have good estimates of the desired quantities? The key to answering this question lies in assessing the Monte Carlo error through a multivariate Markov chain central limit theorem (CLT). The multivariate nature of this Monte Carlo error largely has been ignored in the MCMC literature. We present a multivariate framework for terminating simulation in MCMC. We define a multivariate effective sample size, estimating which requires strongly consistent estimators of the covariance matrix in the Markov chain CLT; a property we show for the multivariate batch means estimator. We then provide a lower bound on the number of minimum effective samples required for a desired level of precision. This lower bound depends on the problem only in the dimension of the expectation being estimated, and not on the underlying stochastic process. This result is obtained by drawing a connection between terminating simulation via effective sample size and terminating simulation using a relative standard deviation fixed-volume sequential stopping rule; which we demonstrate is an asymptotically valid procedure. The finite sample properties of the proposed method are demonstrated in a variety of examples.
We consider batch size selection for a general class of multivariate batch means variance estimators, which are computationally viable for high-dimensional Markov chain Monte Carlo simulations. We derive the asymptotic mean squared error for this class of estimators. Further, we propose a parametric technique for estimating optimal batch sizes and discuss practical issues regarding the estimating process. Vector auto-regressive, Bayesian logistic regression, and Bayesian dynamic space-time examples illustrate the quality of the estimation procedure where the proposed optimal batch sizes outperform current batch size selection methods.
We propose Adaptive Incremental Mixture Markov chain Monte Carlo (AIMM), a novel approach to sample from challenging probability distributions defined on a general state-space. While adaptive MCMC methods usually update a parametric proposal kernel with a global rule, AIMM locally adapts a semiparametric kernel. AIMM is based on an independent Metropolis-Hastings proposal distribution which takes the form of a finite mixture of Gaussian distributions. Central to this approach is the idea that the proposal distribution adapts to the target by locally adding a mixture component when the discrepancy between the proposal mixture and the target is deemed to be too large. As a result, the number of components in the mixture proposal is not fixed in advance. Theoretically, we prove that there exists a process that can be made arbitrarily close to AIMM and that converges to the correct target distribution. We also illustrate that it performs well in practice in a variety of challenging situations, including high-dimensional and multimodal target distributions.
Markov chain Monte Carlo (MCMC) is a simulation method commonly used for estimating expectations with respect to a given distribution. We consider estimating the covariance matrix of the asymptotic multivariate normal distribution of a vector of sample means. Geyer (1992) developed a Monte Carlo error estimation method for estimating a univariate mean. We propose a novel multivariate version of Geyers method that provides an asymptotically valid estimator for the covariance matrix and results in stable Monte Carlo estimates. The finite sample properties of the proposed method are investigated via simulation experiments.