No Arabic abstract
Multifidelity methods are widely used for statistical estimation of quantities of interest (QoIs) in uncertainty quantification using simulation codes of differing costs and accuracies. Many methods approximate numerical-valued statistics that represent only limited information of the QoIs. In this paper, we introduce a semi-parametric approach that aims to effectively describe the distribution of a scalar-valued QoI in the multifidelity setup. Under a linear model hypothesis, we propose an exploration-exploitation strategy to reconstruct the full distribution of a scalar-valued QoI using samples from a subset of low-fidelity regressors. We derive an informative asymptotic bound for the mean 1-Wasserstein distance between the estimator and the true distribution, and use it to adaptively allocate computational budget for parametric estimation and non-parametric reconstruction. Assuming the linear model is correct, we prove that such a procedure is consistent, and converges to the optimal policy (and hence optimal computational budget allocation) under an upper bound criterion as the budget goes to infinity. A major advantage of our approach compared to several other multifidelity methods is that it is automatic, and its implementation does not require a hierarchical model setup, cross-model information, or textit{a priori} known model statistics. Numerical experiments are provided in the end to support our theoretical analysis.
Deterministic interpolation and quadrature methods are often unsuitable to address Bayesian inverse problems depending on computationally expensive forward mathematical models. While interpolation may give precise posterior approximations, deterministic quadrature is usually unable to efficiently investigate an informative and thus concentrated likelihood. This leads to a large number of required expensive evaluations of the mathematical model. To overcome these challenges, we formulate and test a multilevel adaptive sparse Leja algorithm. At each level, adaptive sparse grid interpolation and quadrature are used to approximate the posterior and perform all quadrature operations, respectively. Specifically, our algorithm uses coarse discretizations of the underlying mathematical model to investigate the parameter space and to identify areas of high posterior probability. Adaptive sparse grid algorithms are then used to place points in these areas, and ignore other areas of small posterior probability. The points are weighted Leja points. As the model discretization is coarse, the construction of the sparse grid is computationally efficient. On this sparse grid, the posterior measure can be approximated accurately with few expensive, fine model discretizations. The efficiency of the algorithm can be enhanced further by exploiting more than two discretization levels. We apply the proposed multilevel adaptive sparse Leja algorithm in numerical experiments involving elliptic inverse problems in 2D and 3D space, in which we compare it with Markov chain Monte Carlo sampling and a standard multilevel approximation.
We present a non-trivial integration of dimension-independent likelihood-informed (DILI) MCMC (Cui, Law, Marzouk, 2016) and the multilevel MCMC (Dodwell et al., 2015) to explore the hierarchy of posterior distributions. This integration offers several advantages: First, DILI-MCMC employs an intrinsic likelihood-informed subspace (LIS) (Cui et al., 2014) -- which involves a number of forward and adjoint model simulations -- to design accelerated operator-weighted proposals. By exploiting the multilevel structure of the discretised parameters and discretised forward models, we design a Rayleigh-Ritz procedure to significantly reduce the computational effort in building the LIS and operating with DILI proposals. Second, the resulting DILI-MCMC can drastically improve the sampling efficiency of MCMC at each level, and hence reduce the integration error of the multilevel algorithm for fixed CPU time. To be able to fully exploit the power of multilevel MCMC and to reduce the dependencies of samples on different levels for a parallel implementation, we also suggest a new pooling strategy for allocating computational resources across different levels and constructing Markov chains at higher levels conditioned on those simulated on lower levels. Numerical results confirm the improved computational efficiency of the multilevel DILI approach.
Multifidelity approximation is an important technique in scientific computation and simulation. In this paper, we introduce a bandit-learning approach for leveraging data of varying fidelities to achieve precise estimates of the parameters of interest. Under a linear model assumption, we formulate a multifidelity approximation as a modified stochastic bandit, and analyze the loss for a class of policies that uniformly explore each model before exploiting. Utilizing the estimated conditional mean-squared error, we propose a consistent algorithm, adaptive Explore-Then-Commit (AETC), and establish a corresponding trajectory-wise optimality result. These results are then extended to the case of vector-valued responses, where we demonstrate that the algorithm is efficient without the need to worry about estimating high-dimensional parameters. The main advantage of our approach is that we require neither hierarchical model structure nor textit{a priori} knowledge of statistical information (e.g., correlations) about or between models. Instead, the AETC algorithm requires only knowledge of which model is a trusted high-fidelity model, along with (relative) computational cost estimates of querying each model. Numerical experiments are provided at the end to support our theoretical findings.
Accurate approximation of scalar-valued functions from sample points is a key task in computational science. Recently, machine learning with Deep Neural Networks (DNNs) has emerged as a promising tool for scientific computing, with impressive results achieved on problems where the dimension of the data or problem domain is large. This work broadens this perspective, focusing on approximating functions that are Hilbert-valued, i.e. take values in a separable, but typically infinite-dimensional, Hilbert space. This arises in science and engineering problems, in particular those involving solution of parametric Partial Differential Equations (PDEs). Such problems are challenging: 1) pointwise samples are expensive to acquire, 2) the function domain is high dimensional, and 3) the range lies in a Hilbert space. Our contributions are twofold. First, we present a novel result on DNN training for holomorphic functions with so-called hidden anisotropy. This result introduces a DNN training procedure and full theoretical analysis with explicit guarantees on error and sample complexity. The error bound is explicit in three key errors occurring in the approximation procedure: the best approximation, measurement, and physical discretization errors. Our result shows that there exists a procedure (albeit non-standard) for learning Hilbert-valued functions via DNNs that performs as well as, but no better than current best-in-class schemes. It gives a benchmark lower bound for how well DNNs can perform on such problems. Second, we examine whether better performance can be achieved in practice through different types of architectures and training. We provide preliminary numerical results illustrating practical performance of DNNs on parametric PDEs. We consider different parameters, modifying the DNN architecture to achieve better and competitive results, comparing these to current best-in-class schemes.
Optimization-based samplers such as randomize-then-optimize (RTO) [2] provide an efficient and parallellizable approach to solving large-scale Bayesian inverse problems. These methods solve randomly perturbed optimization problems to draw samples from an approximate posterior distribution. Correcting these samples, either by Metropolization or importance sampling, enables characterization of the original posterior distribution. This paper focuses on the scalability of RTO to problems with high- or infinite-dimensional parameters. We introduce a new subspace acceleration strategy that makes the computational complexity of RTO scale linearly with the parameter dimension. This subspace perspective suggests a natural extension of RTO to a function space setting. We thus formalize a function space version of RTO and establish sufficient conditions for it to produce a valid Metropolis-Hastings proposal, yielding dimension-independent sampling performance. Numerical examples corroborate the dimension-independence of RTO and demonstrate sampling performance that is also robust to small observational noise.