We present a method for performing Hamiltonian Monte Carlo that largely eliminates sample rejection for typical hyperparameters. In situations that would normally lead to rejection, a longer trajectory is instead computed until a new state is reached that can be accepted. This is achieved using Markov chain transitions that satisfy the fixed point equation, but do not satisfy detailed balance. The resulting algorithm significantly suppresses the random walk behavior and wasted function evaluations that are typically the consequence of update rejection. We demonstrate a greater than twofold improvement in mixing time on three test problems. We release the source code as Python and MATLAB packages.
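The distinction between the fixed point equation and detailed balance in the abstract above can be illustrated on a toy discrete chain. The following is a minimal sketch, not the paper's algorithm: the 3-state transition matrix `P`, the distribution `pi`, and both helper functions are hypothetical and chosen only so that the chain mostly cycles in one direction.

```python
# Toy 3-state chain (hypothetical numbers, not taken from the paper).
# P[i][j] is the probability of moving from state i to state j.
# The chain mostly cycles 0 -> 1 -> 2 -> 0, so it is not reversible.
P = [
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.8, 0.1, 0.1],
]
pi = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]  # candidate stationary distribution


def is_stationary(pi, P, tol=1e-12):
    """Fixed point equation: sum_i pi[i] * P[i][j] == pi[j] for every j."""
    n = len(pi)
    return all(
        abs(sum(pi[i] * P[i][j] for i in range(n)) - pi[j]) < tol
        for j in range(n)
    )


def satisfies_detailed_balance(pi, P, tol=1e-12):
    """Detailed balance: pi[i] * P[i][j] == pi[j] * P[j][i] for every pair."""
    n = len(pi)
    return all(
        abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < tol
        for i in range(n)
        for j in range(n)
    )


print(is_stationary(pi, P))               # fixed point equation holds
print(satisfies_detailed_balance(pi, P))  # detailed balance does not
```

Because every column of `P` sums to one, the uniform distribution is left invariant even though the probability flow between any two states is unequal in the two directions. Detailed balance is therefore sufficient, but not necessary, for a chain to have the desired stationary distribution.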
Hamiltonian Monte Carlo is a powerful algorithm for sampling from difficult-to-normalize posterior distributions. However, when the geometry of the posterior is unfavorable, it may take many expensive evaluations of the target distribution and its gradient to converge and mix. We propose neural transport (NeuTra) HMC, a technique for learning to correct this sort of unfavorable geometry using inverse autoregressive flows (IAF), a powerful neural variational inference technique. The IAF is trained to minimize the KL divergence from an isotropic Gaussian to the warped posterior, and then HMC sampling is performed in the warped space. We evaluate NeuTra HMC on a variety of synthetic and real problems, and find that it significantly outperforms vanilla HMC both in time to reach the stationary distribution and asymptotic effective-sample-size rates.
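The idea of sampling in a warped space, as described in the abstract above, rests on the change-of-variables formula: if x = f(z), the density in z is the target density at f(z) times the absolute Jacobian determinant of f. The sketch below assumes a fixed elementwise scaling as a stand-in for the learned inverse autoregressive flow. The target, the `SCALES` values, and all function names are hypothetical and serve only to show how an ill-conditioned Gaussian becomes isotropic in the warped coordinates.

```python
import math

# Hypothetical badly scaled target: independent Gaussians with these
# standard deviations. A learned flow would replace this fixed scaling.
SCALES = [1.0, 0.01]


def target_log_prob(x):
    """Log density of the ill-conditioned Gaussian target."""
    return sum(
        -0.5 * (xi / s) ** 2 - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        for xi, s in zip(x, SCALES)
    )


def forward(z):
    """Stand-in bijector f: maps warped coordinates z to the original x."""
    return [s * zi for s, zi in zip(SCALES, z)]


def log_abs_det_jacobian(z):
    """log |det df/dz|; constant in z for an elementwise scaling."""
    return sum(math.log(s) for s in SCALES)


def warped_log_prob(z):
    """Target density pulled back to z-space by change of variables."""
    return target_log_prob(forward(z)) + log_abs_det_jacobian(z)


def standard_normal_log_prob(z):
    """Log density of an isotropic unit Gaussian, for comparison."""
    return sum(-0.5 * zi * zi - 0.5 * math.log(2.0 * math.pi) for zi in z)


z = [0.3, -1.2]
print(warped_log_prob(z), standard_normal_log_prob(z))
```

In this idealized case the warped density is exactly a standard normal, so HMC run on `warped_log_prob` could use a single step size for both coordinates. Samples in the original space are then recovered by applying `forward` to each warped sample.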
Markov Chain Monte Carlo (MCMC) methods are employed to sample from a given distribution of interest whenever either the distribution does not exist in closed form or, if it does, no efficient method to simulate an independent sample from it is available. Although a wealth of diagnostic tools for convergence assessment of MCMC methods has been proposed in the last two decades, the search for a dependable and easy-to-implement tool is ongoing. In this article, we present a criterion based on the principle of detailed balance that provides a qualitative assessment of the convergence of a given chain. The criterion is based on the behaviour of a one-dimensional statistic, whose asymptotic distribution under the assumption of stationarity is derived; our results apply under weak conditions and have the advantage of being completely intuitive. We implement this criterion as a stopping rule for simulated annealing in the problem of finding maximum likelihood estimators for parameters of a 20-component mixture model. We also apply it to the problem of sampling from a 10-dimensional funnel distribution via slice sampling and the Metropolis-Hastings algorithm. Furthermore, based on this convergence criterion we define a measure of efficiency of one algorithm versus another.
Hamiltonian Monte Carlo (HMC) has been widely adopted in the statistics community because of its ability to sample high-dimensional distributions much more efficiently than other Metropolis-based methods. Despite this, HMC often performs sub-optimally on distributions with high correlations or marginal variances on multiple scales because the resulting stiffness forces the leapfrog integrator in HMC to take an unreasonably small stepsize. We provide intuition as well as a formal analysis showing how these multiscale distributions limit the stepsize of leapfrog and we show how the implicit midpoint method can be used, together with Newton-Krylov iteration, to circumvent this limitation and achieve major efficiency gains. Furthermore, we offer practical guidelines for when to choose between implicit midpoint and leapfrog and what stepsize to use for each method, depending on the distribution being sampled. Unlike previous modifications to HMC, our method is generally applicable to highly non-Gaussian distributions exhibiting multiple scales. We illustrate how our method can provide a dramatic speedup over leapfrog in the context of the No-U-Turn sampler (NUTS) applied to several examples.
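The stepsize limitation described in the abstract above can be seen on a single Gaussian coordinate with standard deviation sigma and unit mass, where leapfrog is known to be stable only for stepsizes below 2 * sigma. The sketch below is a simplified illustration under that assumption. For this quadratic potential the implicit midpoint update can be solved in closed form, so no Newton-Krylov iteration is needed here; all function names and numerical values are hypothetical.

```python
def energy(q, p, sigma):
    """Hamiltonian H = p^2 / 2 + q^2 / (2 sigma^2), unit mass."""
    return 0.5 * p * p + 0.5 * (q / sigma) ** 2


def leapfrog_step(q, p, eps, sigma):
    """One explicit leapfrog step (half kick, full drift, half kick)."""
    p -= 0.5 * eps * q / sigma**2
    q += eps * p
    p -= 0.5 * eps * q / sigma**2
    return q, p


def implicit_midpoint_step(q, p, eps, sigma):
    """One implicit midpoint step.

    The implicit equations are linear for a quadratic potential, so they
    are solved exactly here instead of by an iterative solver.
    """
    a = 0.5 * eps
    w2 = 1.0 / sigma**2
    p_new = (p * (1.0 - a * a * w2) - 2.0 * a * w2 * q) / (1.0 + a * a * w2)
    q_new = q + a * (p + p_new)
    return q_new, p_new


def final_energy(step, eps, sigma, n_steps=20):
    """Integrate from (q, p) = (sigma, 1), where H = 1, and return H."""
    q, p = sigma, 1.0
    for _ in range(n_steps):
        q, p = step(q, p, eps, sigma)
    return energy(q, p, sigma)


sigma = 0.01  # narrow direction of a multiscale distribution
print(final_energy(leapfrog_step, 0.015, sigma))          # eps < 2 * sigma
print(final_energy(leapfrog_step, 0.05, sigma))           # eps > 2 * sigma
print(final_energy(implicit_midpoint_step, 0.05, sigma))  # same large eps
```

With the stepsize above 2 * sigma the leapfrog energy grows without bound, whereas the implicit midpoint rule conserves this quadratic energy up to rounding error. In a multiscale target, the smallest scale therefore dictates the leapfrog stepsize for every coordinate, which is the stiffness issue the abstract refers to.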
Continuous-time Hamiltonian Monte Carlo is introduced as a powerful alternative to Markov chain Monte Carlo methods for continuous target distributions. The method is constructed in two steps. First, Hamiltonian dynamics are chosen as the deterministic dynamics in a continuous-time piecewise deterministic Markov process. Under very mild restrictions, such a process will have the desired target distribution as an invariant distribution. Second, the numerical implementation of such processes, based on adaptive numerical integration of second-order ordinary differential equations, is considered. The numerical implementation yields an approximate, yet highly robust algorithm that, unlike conventional Hamiltonian Monte Carlo, enables the exploitation of the complete Hamiltonian trajectories (hence the title). The proposed algorithm may yield large speedups and improvements in stability relative to relevant benchmarks, while incurring numerical errors that are negligible relative to the overall Monte Carlo errors.