In Bayesian inference, predictive distributions are typically available only in the form of samples generated via Markov chain Monte Carlo (MCMC) or related algorithms. In this paper, we conduct a systematic analysis of how to make and evaluate probabilistic forecasts from such simulation output. Based on proper scoring rules, we develop a notion of consistency that allows one to assess the adequacy of methods for estimating the stationary distribution underlying the simulation output. We then provide asymptotic results that account for the salient features of Bayesian posterior simulators, and derive conditions under which choices from the literature satisfy our notion of consistency. Importantly, these conditions depend on the scoring rule being used, so that the choices of approximation method and scoring rule are intertwined. While the logarithmic rule requires fairly stringent conditions, the continuous ranked probability score (CRPS) yields consistent approximations under minimal assumptions. These results are illustrated in a simulation study and an economic data example. Overall, mixture-of-parameters approximations, which exploit the parametric structure of Bayesian models, perform particularly well. Under the CRPS, the empirical distribution function is a simple and appealing alternative.
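The empirical-distribution-function approximation under the CRPS mentioned above can be sketched with the standard sample-based identity CRPS(F, y) = E|X − y| − ½E|X − X′|, where X, X′ are independent draws from F. The function name below is illustrative, not taken from the paper:

```python
import numpy as np

def crps_empirical(samples, y):
    """CRPS of the empirical distribution of `samples` at observation y.

    Uses the identity CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|,
    with X, X' independent draws from F (here, the MCMC sample).
    """
    x = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(x - y))
    term2 = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return term1 - term2

# A one-point sample reduces the CRPS to the absolute error |x - y|.
print(crps_empirical([2.0], 5.0))  # 3.0
```

The O(n²) pairwise term is exact; for very long chains one would typically use the equivalent O(n log n) formula based on sorted samples.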
A novel class of non-reversible Markov chain Monte Carlo schemes relying on continuous-time piecewise-deterministic Markov processes has recently emerged. In these algorithms, the state of the Markov process evolves according to deterministic dynamics, which are modified using a Markov transition kernel at random event times. These methods enjoy remarkable features, including the ability to update only a subset of the state components while the other components implicitly keep evolving, and the ability to use an unbiased estimate of the gradient of the log-target while preserving the target as the invariant distribution. However, they also suffer from important limitations. The deterministic dynamics used so far do not exploit the structure of the target. Moreover, exact simulation of the event times is feasible only for an important yet restricted class of problems and, even when it is, the resulting implementation is application specific. This limits the applicability of these techniques and prevents the development of a generic software implementation of them. We introduce novel MCMC methods addressing these shortcomings. In particular, we introduce novel continuous-time algorithms relying on exact Hamiltonian flows and novel non-reversible discrete-time algorithms which can exploit complex dynamics, such as approximate Hamiltonian dynamics arising from symplectic integrators, while preserving the attractive features of continuous-time algorithms. We demonstrate the performance of these schemes on a variety of applications.
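To make the event-time mechanism concrete, here is a minimal sketch of a classic piecewise-deterministic sampler, the one-dimensional Zig-Zag process, for a standard normal target — not the new algorithms of the paper, but an instance of the class it builds on. For this target the event rate is λ(x, θ) = max(0, θx) and the event times can be simulated exactly by inverting the integrated rate:

```python
import numpy as np

def zigzag_normal(n_events=20000, seed=1):
    """1D Zig-Zag sampler targeting N(0, 1).

    The state moves at constant velocity theta in {-1, +1}; at events,
    drawn exactly by inverting the integrated rate max(0, theta*x(s)),
    the velocity flips.  Moments are computed by integrating the
    piecewise-linear trajectory in continuous time.
    """
    rng = np.random.default_rng(seed)
    x, theta = 0.0, 1.0
    t_tot, m1, m2 = 0.0, 0.0, 0.0
    for _ in range(n_events):
        e = rng.exponential()
        a = theta * x  # rate along the path is max(0, a + s)
        if a >= 0.0:
            tau = np.sqrt(a * a + 2.0 * e) - a
        else:
            tau = -a + np.sqrt(2.0 * e)  # rate is zero until s = -a
        # Accumulate integrals of x(s) and x(s)^2 with x(s) = x + theta*s.
        m1 += x * tau + theta * tau ** 2 / 2.0
        m2 += x * x * tau + x * theta * tau ** 2 + tau ** 3 / 3.0
        t_tot += tau
        x += theta * tau
        theta = -theta  # deterministic flip kernel at the event
    return m1 / t_tot, m2 / t_tot

mean, second_moment = zigzag_normal()
```

The closed-form inversion of the integrated rate is exactly the exact-simulation step that, as the abstract notes, is feasible here only because the Gaussian target makes the rate piecewise linear.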
Markov chain Monte Carlo (MCMC) produces a correlated sample for estimating expectations with respect to a target distribution. A fundamental question is: when should sampling stop so that we have good estimates of the desired quantities? The key to answering this question lies in assessing the Monte Carlo error through a multivariate Markov chain central limit theorem (CLT). The multivariate nature of this Monte Carlo error has largely been ignored in the MCMC literature. We present a multivariate framework for terminating simulation in MCMC. We define a multivariate effective sample size whose estimation requires strongly consistent estimators of the covariance matrix in the Markov chain CLT, a property we establish for the multivariate batch means estimator. We then provide a lower bound on the minimum number of effective samples required for a desired level of precision. This lower bound depends on the problem only through the dimension of the expectation being estimated, and not on the underlying stochastic process. This result is obtained by drawing a connection between terminating simulation via effective sample size and terminating simulation using a relative standard deviation fixed-volume sequential stopping rule, which we demonstrate is an asymptotically valid procedure. The finite sample properties of the proposed method are demonstrated in a variety of examples.
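A compact sketch of the two ingredients above — the multivariate batch means estimator of the CLT covariance and the resulting multivariate effective sample size mESS = n·(det Λ / det Σ)^(1/p) — might look as follows (the batch-size default √n and the function name are my choices, not prescriptions from the paper):

```python
import numpy as np

def multivariate_ess(chain, b=None):
    """Multivariate effective sample size via the batch means estimator.

    chain: (n, p) array of MCMC output.  Computes
        mESS = n * (det(Lambda) / det(Sigma))**(1/p),
    where Lambda is the sample covariance of the chain and Sigma is the
    batch means estimate of the asymptotic covariance in the CLT.
    """
    chain = np.asarray(chain, dtype=float)
    n, p = chain.shape
    if b is None:
        b = int(np.floor(np.sqrt(n)))  # a common default batch size
    a = n // b                          # number of batches
    means = chain[: a * b].reshape(a, b, p).mean(axis=1)
    grand = chain[: a * b].mean(axis=0)
    sigma = b * (means - grand).T @ (means - grand) / (a - 1)
    lam = np.cov(chain, rowvar=False)
    _, logdet_lam = np.linalg.slogdet(lam)
    _, logdet_sigma = np.linalg.slogdet(sigma)
    return n * np.exp((logdet_lam - logdet_sigma) / p)

# For i.i.d. draws, Sigma and Lambda agree and mESS is close to n.
rng = np.random.default_rng(0)
ess = multivariate_ess(rng.normal(size=(4000, 3)))
```

Positive autocorrelation inflates Σ relative to Λ, shrinking mESS below n, which is what triggers continued simulation under the stopping rule.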
Markov chain Monte Carlo (MCMC) is widely used for Bayesian inference in models of complex systems. Performance, however, is often unsatisfactory in models with many latent variables due to so-called poor mixing, necessitating the development of application-specific implementations. This paper introduces posterior-based proposals (PBPs), a new type of MCMC update applicable to a huge class of statistical models (those whose conditional dependence structures are represented by directed acyclic graphs). PBPs generate large joint updates in parameter and latent variable space, whilst retaining good acceptance rates (typically 33%). Evaluation against other approaches (from standard Gibbs / random walk updates to state-of-the-art Hamiltonian and particle MCMC methods) was carried out for widely varying model types: an individual-based model for disease diagnostic test data, a financial stochastic volatility model, a mixed model used in statistical genetics and a population model used in ecology. Whilst different methods worked better or worse in different scenarios, PBPs were found to be either close to the fastest or significantly faster than the next best approach (by up to a factor of 10). PBPs therefore represent an additional general purpose technique that can be usefully applied in a wide variety of contexts.
We propose Adaptive Incremental Mixture Markov chain Monte Carlo (AIMM), a novel approach to sample from challenging probability distributions defined on a general state-space. While adaptive MCMC methods usually update a parametric proposal kernel with a global rule, AIMM locally adapts a semiparametric kernel. AIMM is based on an independent Metropolis-Hastings proposal distribution which takes the form of a finite mixture of Gaussian distributions. Central to this approach is the idea that the proposal distribution adapts to the target by locally adding a mixture component when the discrepancy between the proposal mixture and the target is deemed to be too large. As a result, the number of components in the mixture proposal is not fixed in advance. Theoretically, we prove that there exists a process that can be made arbitrarily close to AIMM and that converges to the correct target distribution. We also illustrate that it performs well in practice in a variety of challenging situations, including high-dimensional and multimodal target distributions.
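The core AIMM idea — an independence Metropolis–Hastings sampler whose Gaussian mixture proposal grows a new local component wherever the target-to-proposal discrepancy is too large — can be illustrated with the following heavily simplified one-dimensional toy. The bimodal target, the fixed component weight and width, and the threshold rule are all illustrative choices of mine; the paper's actual adaptation rule and its theoretical safeguards (e.g. for diminishing adaptation) are ignored here:

```python
import numpy as np

def normal_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def target(x):
    # A bimodal toy target: 0.5 N(-3, 1) + 0.5 N(3, 1).
    return 0.5 * normal_pdf(x, -3.0, 1.0) + 0.5 * normal_pdf(x, 3.0, 1.0)

def aimm_sketch(n_iter=30000, threshold=3.0, seed=4):
    rng = np.random.default_rng(seed)
    mus, sds, ws = [0.0], [5.0], [1.0]   # start from one wide component

    def proposal_pdf(x):
        tot = sum(ws)
        return sum(w * normal_pdf(x, m, s)
                   for w, m, s in zip(ws, mus, sds)) / tot

    def propose():
        k = rng.choice(len(ws), p=np.array(ws) / sum(ws))
        return rng.normal(mus[k], sds[k])

    x, out = 0.0, np.empty(n_iter)
    for i in range(n_iter):
        y = propose()
        # Independence MH acceptance ratio.
        alpha = (target(y) * proposal_pdf(x)) / (target(x) * proposal_pdf(y))
        if rng.random() < alpha:
            x = y
        # Incremental step: add a local component where the current
        # mixture underweights the target.  The number of components
        # is therefore not fixed in advance.
        if target(x) / proposal_pdf(x) > threshold:
            mus.append(x); sds.append(1.0); ws.append(0.5)
        out[i] = x
    return out

samples = aimm_sketch()
```

Each added component lowers the local discrepancy, so the growth of the mixture is self-limiting around regions the proposal initially missed.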
A novel strategy that combines a given collection of reversible Markov kernels is proposed. It consists of a Markov chain that moves, at each iteration, according to one of the available Markov kernels, selected via a state-dependent probability distribution which is thus dubbed locally informed. In contrast to random-scan approaches, which assume a constant selection probability distribution, the state-dependent distribution is typically specified so as to privilege moving according to a kernel which is relevant for the local topology of the target distribution. The second contribution is to characterize situations where a locally informed strategy should be preferred to its random-scan counterpart. We find that, for a specific class of target distributions, referred to as sparse and filamentary, which exhibit strong correlations between some variables and/or concentrate their probability mass on low-dimensional linear subspaces or on thin curved manifolds, a locally informed strategy converges substantially faster and yields smaller asymptotic variances than an equivalent random-scan algorithm. The research is at this stage essentially speculative: this paper combines a series of observations on this topic, both theoretical and empirical, that could serve as a groundwork for further investigations.
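One subtlety of state-dependent kernel selection is that the selection probabilities must enter the acceptance ratio, or invariance is lost. The toy sketch below (my simplification, not the paper's construction) mixes a "local" and a "global" random-walk proposal with weights depending on the current state, and corrects with the full mixture density q(x, y) = Σ_j s_j(x) φ(y; x, σ_j) in a standard Metropolis–Hastings step:

```python
import numpy as np

def normal_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def log_target(x):
    return -0.5 * x * x   # standard normal target, up to a constant

SDS = (0.2, 2.0)          # a "local" and a "global" random-walk kernel

def select_prob(x):
    """State-dependent probability of each kernel: favour local moves
    away from the mode (a purely illustrative informed rule)."""
    p = 1.0 / (1.0 + np.exp(-(abs(x) - 1.0)))
    return np.array([p, 1.0 - p])

def mixture_q(x, y):
    s = select_prob(x)
    return s[0] * normal_pdf(y, x, SDS[0]) + s[1] * normal_pdf(y, x, SDS[1])

def locally_informed_mh(n_iter=20000, seed=7):
    rng = np.random.default_rng(seed)
    x, out = 0.0, np.empty(n_iter)
    for i in range(n_iter):
        k = rng.choice(2, p=select_prob(x))
        y = rng.normal(x, SDS[k])
        # The selection weights appear in q at both x and y, so the
        # corrected chain leaves the target invariant.
        log_alpha = (log_target(y) + np.log(mixture_q(y, x))
                     - log_target(x) - np.log(mixture_q(x, y)))
        if np.log(rng.random()) < log_alpha:
            x = y
        out[i] = x
    return out

samples = locally_informed_mh()
```

Setting `select_prob` to a constant vector recovers the random-scan baseline against which the abstract's comparison is made.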