No Arabic abstract
Gaussian latent variable models are a key class of Bayesian hierarchical models with applications in many fields. Performing Bayesian inference on such models can be challenging as Markov chain Monte Carlo algorithms struggle with the geometry of the resulting posterior distribution and can be prohibitively slow. An alternative is to use a Laplace approximation to marginalize out the latent Gaussian variables and then integrate out the remaining hyperparameters using dynamic Hamiltonian Monte Carlo, a gradient-based Markov chain Monte Carlo sampler. To implement this scheme efficiently, we derive a novel adjoint method that propagates the minimal information needed to construct the gradient of the approximate marginal likelihood. This strategy yields a scalable differentiation method that is orders of magnitude faster than state of the art differentiation techniques when the hyperparameters are high dimensional. We prototype the method in the probabilistic programming framework Stan and test the utility of the embedded Laplace approximation on several models, including one where the dimension of the hyperparameter is $sim$6,000. Depending on the cases, the benefits can include an alleviation of the geometric pathologies that frustrate Hamiltonian Monte Carlo and a dramatic speed-up.
Dynamically rescaled Hamiltonian Monte Carlo (DRHMC) is introduced as a computationally fast and easily implemented method for performing full Bayesian analysis in hierarchical statistical models. The method relies on introducing a modified parameterisation so that the re-parameterised target distribution has close to constant scaling properties, and thus is easily sampled using standard (Euclidian metric) Hamiltonian Monte Carlo. Provided that the parameterisations of the conditional distributions specifying the hierarchical model are constant information parameterisations (CIP), the relation between the modified- and original parameterisation is bijective, explicitly computed and admit exploitation of sparsity in the numerical linear algebra involved. CIPs for a large catalogue of statistical models are presented, and from the catalogue, it is clear that many CIPs are currently routinely used in statistical computing. A relation between the proposed methodology and a class of explicitly integrated Riemann manifold Hamiltonian Monte Carlo methods is discussed. The methodology is illustrated on several example models, including a model for inflation rates with multiple levels of non-linearly dependent latent variables.
Deep Gaussian Processes (DGPs) are hierarchical generalizations of Gaussian Processes that combine well calibrated uncertainty estimates with the high flexibility of multilayer models. One of the biggest challenges with these models is that exact inference is intractable. The current state-of-the-art inference method, Variational Inference (VI), employs a Gaussian approximation to the posterior distribution. This can be a potentially poor unimodal approximation of the generally multimodal posterior. In this work, we provide evidence for the non-Gaussian nature of the posterior and we apply the Stochastic Gradient Hamiltonian Monte Carlo method to generate samples. To efficiently optimize the hyperparameters, we introduce the Moving Window MCEM algorithm. This results in significantly better predictions at a lower computational cost than its VI counterpart. Thus our method establishes a new state-of-the-art for inference in DGPs.
In this paper, we develop Bayesian Hamiltonian Monte Carlo methods for inference in asymmetric GARCH models under different distributions for the error term. We implemented Zero-variance and Hamiltonian Monte Carlo schemes for parameter estimation to try and reduce the standard errors of the estimates thus obtaing more efficient results at the price of a small extra computational cost.
Bayesian inference for nonlinear diffusions, observed at discrete times, is a challenging task that has prompted the development of a number of algorithms, mainly within the computational statistics community. We propose a new direction, and accompanying methodology, borrowing ideas from statistical physics and computational chemistry, for inferring the posterior distribution of latent diffusion paths and model parameters, given observations of the process. Joint configurations of the underlying process noise and of parameters, mapping onto diffusion paths consistent with observations, form an implicitly defined manifold. Then, by making use of a constrained Hamiltonian Monte Carlo algorithm on the embedded manifold, we are able to perform computationally efficient inference for an extensive class of discretely observed diffusion models. Critically, in contrast with other approaches proposed in the literature, our methodology is highly automated, requiring minimal user intervention and applying alike in a range of settings, including: elliptic or hypo-elliptic systems; observations with or without noise; linear or non-linear observation operators. Exploiting Markovianity, we propose a variant of the method with complexity that scales linearly in the resolution of path discretisation and the number of observation times.
The coalescence of binary neutron stars are one of the main sources of gravitational waves for ground-based gravitational wave detectors. As Bayesian inference for binary neutron stars is computationally expensive, more efficient and faster converging algorithms are always needed. In this work, we conduct a feasibility study using a Hamiltonian Monte Carlo algorithm (HMC). The HMC is a sampling algorithm that takes advantage of gradient information from the geometry of the parameter space to efficiently sample from the posterior distribution, allowing the algorithm to avoid the random-walk behaviour commonly associated with stochastic samplers. As well as tuning the algorithms free parameters specifically for gravitational wave astronomy, we introduce a method for approximating the gradients of the log-likelihood that reduces the runtime for a $10^6$ trajectory run from ten weeks, using numerical derivatives along the Hamiltonian trajectories, to one day, in the case of non-spinning neutron stars. Testing our algorithm against a set of neutron star binaries using a detector network composed of Advanced LIGO and Advanced Virgo at optimal design, we demonstrate that not only is our algorithm more efficient than a standard sampler, but a $10^6$ trajectory HMC produces an effective sample size on the order of $10^4 - 10^5$ statistically independent samples.