No Arabic abstract
We present a non-trivial integration of dimension-independent likelihood-informed (DILI) MCMC (Cui, Law, Marzouk, 2016) and the multilevel MCMC (Dodwell et al., 2015) to explore the hierarchy of posterior distributions. This integration offers several advantages: First, DILI-MCMC employs an intrinsic likelihood-informed subspace (LIS) (Cui et al., 2014) -- which involves a number of forward and adjoint model simulations -- to design accelerated operator-weighted proposals. By exploiting the multilevel structure of the discretised parameters and discretised forward models, we design a Rayleigh-Ritz procedure to significantly reduce the computational effort in building the LIS and operating with DILI proposals. Second, the resulting DILI-MCMC can drastically improve the sampling efficiency of MCMC at each level, and hence reduce the integration error of the multilevel algorithm for fixed CPU time. To be able to fully exploit the power of multilevel MCMC and to reduce the dependencies of samples on different levels for a parallel implementation, we also suggest a new pooling strategy for allocating computational resources across different levels and constructing Markov chains at higher levels conditioned on those simulated on lower levels. Numerical results confirm the improved computational efficiency of the multilevel DILI approach.
Deterministic interpolation and quadrature methods are often unsuitable to address Bayesian inverse problems depending on computationally expensive forward mathematical models. While interpolation may give precise posterior approximations, deterministic quadrature is usually unable to efficiently investigate an informative and thus concentrated likelihood. This leads to a large number of required expensive evaluations of the mathematical model. To overcome these challenges, we formulate and test a multilevel adaptive sparse Leja algorithm. At each level, adaptive sparse grid interpolation and quadrature are used to approximate the posterior and perform all quadrature operations, respectively. Specifically, our algorithm uses coarse discretizations of the underlying mathematical model to investigate the parameter space and to identify areas of high posterior probability. Adaptive sparse grid algorithms are then used to place points in these areas, and ignore other areas of small posterior probability. The points are weighted Leja points. As the model discretization is coarse, the construction of the sparse grid is computationally efficient. On this sparse grid, the posterior measure can be approximated accurately with few expensive, fine model discretizations. The efficiency of the algorithm can be enhanced further by exploiting more than two discretization levels. We apply the proposed multilevel adaptive sparse Leja algorithm in numerical experiments involving elliptic inverse problems in 2D and 3D space, in which we compare it with Markov chain Monte Carlo sampling and a standard multilevel approximation.
The methodology developed in this article is motivated by a wide range of prediction and uncertainty quantification problems that arise in Statistics, Machine Learning and Applied Mathematics, such as non-parametric regression, multi-class classification and inversion of partial differential equations. One popular formulation of such problems is as Bayesian inverse problems, where a prior distribution is used to regularize inference on a high-dimensional latent state, typically a function or a field. It is common that such priors are non-Gaussian, for example piecewise-constant or heavy-tailed, and/or hierarchical, in the sense of involving a further set of low-dimensional parameters, which, for example, control the scale or smoothness of the latent state. In this formulation prediction and uncertainty quantification relies on efficient exploration of the posterior distribution of latent states and parameters. This article introduces a framework for efficient MCMC sampling in Bayesian inverse problems that capitalizes upon two fundamental ideas in MCMC, non-centred parameterisations of hierarchical models and dimension-robust samplers for latent Gaussian processes. Using a range of diverse applications we showcase that the proposed framework is dimension-robust, that is, the efficiency of the MCMC sampling does not deteriorate as the dimension of the latent state gets higher. We showcase the full potential of the machinery we develop in the article in semi-supervised multi-class classification, where our sampling algorithm is used within an active learning framework to guide the selection of input data to manually label in order to achieve high predictive accuracy with a minimal number of labelled data.
This position paper summarizes a recently developed research program focused on inference in the context of data centric science and engineering applications, and forecasts its trajectory forward over the next decade. Often one endeavours in this context to learn complex systems in order to make more informed predictions and high stakes decisions under uncertainty. Some key challenges which must be met in this context are robustness, generalizability, and interpretability. The Bayesian framework addresses these three challenges, while bringing with it a fourth, undesirable feature: it is typically far more expensive than its deterministic counterparts. In the 21st century, and increasingly over the past decade, a growing number of methods have emerged which allow one to leverage cheap low-fidelity models in order to precondition algorithms for performing inference with more expensive models and make Bayesian inference tractable in the context of high-dimensional and expensive models. Notable examples are multilevel Monte Carlo (MLMC), multi-index Monte Carlo (MIMC), and their randomized counterparts (rMLMC), which are able to provably achieve a dimension-independent (including $infty-$dimension) canonical complexity rate with respect to mean squared error (MSE) of $1/$MSE. Some parallelizability is typically lost in an inference context, but recently this has been largely recovered via novel double randomization approaches. Such an approach delivers i.i.d. samples of quantities of interest which are unbiased with respect to the infinite resolution target distribution. Over the coming decade, this family of algorithms has the potential to transform data centric science and engineering, as well as classical machine learning applications such as deep learning, by scaling up and scaling out fully Bayesian inference.
Markov Chain Monte Carlo methods become increasingly popular in applied mathematics as a tool for numerical integration with respect to complex and high-dimensional distributions. However, application of MCMC methods to heavy tailed distributions and distributions with analytically intractable densities turns out to be rather problematic. In this paper, we propose a novel approach towards the use of MCMC algorithms for distributions with analytically known Fourier transforms and, in particular, heavy tailed distributions. The main idea of the proposed approach is to use MCMC methods in Fourier domain to sample from a density proportional to the absolute value of the underlying characteristic function. A subsequent application of the Parsevals formula leads to an efficient algorithm for the computation of integrals with respect to the underlying density. We show that the resulting Markov chain in Fourier domain may be geometrically ergodic even in the case of heavy tailed original distributions. We illustrate our approach by several numerical examples including multivariate elliptically contoured stable distributions.
Stochastic gradient Markov chain Monte Carlo (MCMC) algorithms have received much attention in Bayesian computing for big data problems, but they are only applicable to a small class of problems for which the parameter space has a fixed dimension and the log-posterior density is differentiable with respect to the parameters. This paper proposes an extended stochastic gradient MCMC lgoriathm which, by introducing appropriate latent variables, can be applied to more general large-scale Bayesian computing problems, such as those involving dimension jumping and missing data. Numerical studies show that the proposed algorithm is highly scalable and much more efficient than traditional MCMC algorithms. The proposed algorithms have much alleviated the pain of Bayesian methods in big data computing.