No Arabic abstract
Riemann manifold Hamiltonian Monte Carlo (RMHMC) has the potential to produce high-quality Markov chain Monte Carlo-output even for very challenging target distributions. To this end, a symmetric positive definite scaling matrix for RMHMC, which derives, via a modified Cholesky factorization, from the potentially indefinite negative Hessian of the target log-density is proposed. The methodology is able to exploit the sparsity of the Hessian, stemming from conditional independence modeling assumptions, and thus admit fast implementation of RMHMC even for high-dimensional target distributions. Moreover, the methodology can exploit log-concave conditional target densities, often encountered in Bayesian hierarchical models, for faster sampling and more straight forward tuning. The proposed methodology is compared to alternatives for some challenging targets, and is illustrated by applying a state space model to real data.
Hamiltonian Monte Carlo (HMC) has been widely adopted in the statistics community because of its ability to sample high-dimensional distributions much more efficiently than other Metropolis-based methods. Despite this, HMC often performs sub-optimally on distributions with high correlations or marginal variances on multiple scales because the resulting stiffness forces the leapfrog integrator in HMC to take an unreasonably small stepsize. We provide intuition as well as a formal analysis showing how these multiscale distributions limit the stepsize of leapfrog and we show how the implicit midpoint method can be used, together with Newton-Krylov iteration, to circumvent this limitation and achieve major efficiency gains. Furthermore, we offer practical guidelines for when to choose between implicit midpoint and leapfrog and what stepsize to use for each method, depending on the distribution being sampled. Unlike previous modifications to HMC, our method is generally applicable to highly non-Gaussian distributions exhibiting multiple scales. We illustrate how our method can provide a dramatic speedup over leapfrog in the context of the No-U-Turn sampler (NUTS) applied to several examples.
Dynamically rescaled Hamiltonian Monte Carlo (DRHMC) is introduced as a computationally fast and easily implemented method for performing full Bayesian analysis in hierarchical statistical models. The method relies on introducing a modified parameterisation so that the re-parameterised target distribution has close to constant scaling properties, and thus is easily sampled using standard (Euclidian metric) Hamiltonian Monte Carlo. Provided that the parameterisations of the conditional distributions specifying the hierarchical model are constant information parameterisations (CIP), the relation between the modified- and original parameterisation is bijective, explicitly computed and admit exploitation of sparsity in the numerical linear algebra involved. CIPs for a large catalogue of statistical models are presented, and from the catalogue, it is clear that many CIPs are currently routinely used in statistical computing. A relation between the proposed methodology and a class of explicitly integrated Riemann manifold Hamiltonian Monte Carlo methods is discussed. The methodology is illustrated on several example models, including a model for inflation rates with multiple levels of non-linearly dependent latent variables.
Continuous time Hamiltonian Monte Carlo is introduced, as a powerful alternative to Markov chain Monte Carlo methods for continuous target distributions. The method is constructed in two steps: First Hamiltonian dynamics are chosen as the deterministic dynamics in a continuous time piecewise deterministic Markov process. Under very mild restrictions, such a process will have the desired target distribution as an invariant distribution. Secondly, the numerical implementation of such processes, based on adaptive numerical integration of second order ordinary differential equations is considered. The numerical implementation yields an approximate, yet highly robust algorithm that, unlike conventional Hamiltonian Monte Carlo, enables the exploitation of the complete Hamiltonian trajectories (hence the title). The proposed algorithm may yield large speedups and improvements in stability relative to relevant benchmarks, while incurring numerical errors that are negligible relative to the overall Monte Carlo errors.
We explore the construction of new symplectic numerical integration schemes to be used in Hamiltonian Monte Carlo and study their efficiency. Two integration schemes from Blanes et al. (2014), and a new scheme based on optimal acceptance probability, are considered as candidates to the commonly used leapfrog method. All integration schemes are tested within the framework of the No-U-Turn sampler (NUTS), both for a logistic regression model and a student $t$-model. The results show that the leapfrog method is inferior to all the new methods both in terms of asymptotic expected acceptance probability for a model problem and the and efficient sample size per computing time for the realistic models.
Bayesian inference for nonlinear diffusions, observed at discrete times, is a challenging task that has prompted the development of a number of algorithms, mainly within the computational statistics community. We propose a new direction, and accompanying methodology, borrowing ideas from statistical physics and computational chemistry, for inferring the posterior distribution of latent diffusion paths and model parameters, given observations of the process. Joint configurations of the underlying process noise and of parameters, mapping onto diffusion paths consistent with observations, form an implicitly defined manifold. Then, by making use of a constrained Hamiltonian Monte Carlo algorithm on the embedded manifold, we are able to perform computationally efficient inference for an extensive class of discretely observed diffusion models. Critically, in contrast with other approaches proposed in the literature, our methodology is highly automated, requiring minimal user intervention and applying alike in a range of settings, including: elliptic or hypo-elliptic systems; observations with or without noise; linear or non-linear observation operators. Exploiting Markovianity, we propose a variant of the method with complexity that scales linearly in the resolution of path discretisation and the number of observation times.