The increasing availability of data presents an opportunity to calibrate unknown parameters which appear in complex models of phenomena in the biomedical, physical and social sciences. However, model complexity often leads to parameter-to-data maps which are expensive to evaluate and are only available through noisy approximations. This paper is concerned with the use of interacting particle systems for the solution of the resulting inverse problems for parameters. Of particular interest is the case where the available forward model evaluations are subject to rapid fluctuations in parameter space, superimposed on the smoothly varying large-scale parametric structure of interest. Multiscale analysis is used to study the behaviour of interacting particle system algorithms when such rapid fluctuations, which we refer to as noise, pollute the large-scale parametric dependence of the parameter-to-data map. Ensemble Kalman methods (which are derivative-free) and Langevin-based methods (which use the derivative of the parameter-to-data map) are compared in this light. The ensemble Kalman methods are shown to behave favourably in the presence of noise in the parameter-to-data map, whereas Langevin methods are adversely affected. On the other hand, Langevin methods have the correct equilibrium distribution in the setting of noise-free forward models, whilst ensemble Kalman methods provide only an uncontrolled approximation, except in the linear case. Therefore a new class of algorithms, ensemble Gaussian process samplers, which combine the benefits of both ensemble Kalman and Langevin methods, is introduced and shown to perform favourably.
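To make the derivative-free ensemble Kalman update concrete, the following is a minimal sketch of ensemble Kalman inversion for a scalar inverse problem whose forward map carries exactly the kind of rapid small-scale fluctuation described above; the forward map, noise levels and iteration count are illustrative assumptions, not the paper's specific setup.

```python
# Minimal sketch: derivative-free ensemble Kalman inversion (EKI) for
# y = G(u) + eta. The ensemble covariances average over the particles,
# so rapid fluctuations in G are smoothed rather than differentiated.
import numpy as np

rng = np.random.default_rng(0)

def G(u):
    # Smooth large-scale map polluted by rapid small-scale oscillation
    # ("noise" in parameter space); an illustrative choice.
    return u**2 + 0.05 * np.sin(100.0 * u)

u_true, gamma = 1.0, 0.1               # true parameter, observational noise std
y = G(u_true) + gamma * rng.normal()

J = 50                                 # ensemble size
u = rng.normal(2.0, 0.5, size=J)       # initial ensemble (prior draws)

for n in range(30):
    Gu = G(u)
    u_bar, G_bar = u.mean(), Gu.mean()
    C_uG = np.mean((u - u_bar) * (Gu - G_bar))   # parameter-output covariance
    C_GG = np.mean((Gu - G_bar) ** 2)            # output covariance
    K = C_uG / (C_GG + gamma**2)                 # scalar Kalman gain
    # Perturbed observations keep the ensemble spread consistent with gamma.
    u = u + K * (y + gamma * rng.normal(size=J) - Gu)

print(f"ensemble mean {u.mean():.3f}, true value {u_true:.3f}")
```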
We investigate the application of ensemble transform approaches to Bayesian inference for logistic regression problems. Our approach relies on appropriate extensions of the popular ensemble Kalman filter and the feedback particle filter to the cross-entropy loss function, and is based on a well-established homotopy approach to Bayesian inference. The arising finite-particle evolution equations, as well as their mean-field limits, are affine-invariant. Furthermore, the proposed methods can be implemented in a gradient-free manner in the case of nonlinear logistic regression, and the data can be randomly subsampled in a manner similar to the mini-batching of stochastic gradient descent. We also propose a closely related SDE-based sampling method which is again affine-invariant and can easily be made gradient-free. Numerical examples demonstrate the effectiveness of the proposed methodologies.
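The gradient-free flavour of such updates can be illustrated as follows: class labels play the role of data, predicted probabilities play the role of the forward map, and ensemble cross-covariances replace derivatives. This is an EKI-style sketch of the general idea, not the paper's exact homotopy scheme; the step size, ensemble size and synthetic data are assumptions.

```python
# Illustrative gradient-free ensemble update for Bayesian logistic
# regression: no gradient of the cross-entropy loss is ever evaluated.
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

N, d, J = 200, 2, 64
X = rng.normal(size=(N, d))
w_true = np.array([1.5, -2.0])
y = (rng.uniform(size=N) < sigmoid(X @ w_true)).astype(float)

W = rng.normal(size=(J, d))            # ensemble of weight vectors (prior draws)
dt = 1.0                               # illustrative step size

for _ in range(500):
    H = sigmoid(W @ X.T)               # (J, N) predicted probabilities
    Wc = W - W.mean(axis=0)
    Hc = H - H.mean(axis=0)
    C_wh = Wc.T @ Hc / J               # (d, N) parameter-output cross-covariance
    # Derivative-free drift toward high-likelihood weights; the ensemble
    # contracts as it moves (1/N scaling keeps the step size data-size free).
    W = W + dt * ((y - H) @ C_wh.T) / N

print("ensemble mean weights:", W.mean(axis=0), " true weights:", w_true)
```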
Partial differential equations (PDEs) are used, with huge success, to model phenomena arising across all scientific and engineering disciplines. However, across an equally wide swath, there exist situations in which PDE models fail to adequately model observed phenomena or are not the best available model for that purpose. On the other hand, in many situations, nonlocal models that account for interaction occurring at a distance have been shown to more faithfully and effectively model observed phenomena that involve possible singularities and other anomalies. In this article, we consider a generic nonlocal model, beginning with a short review of its definition, the properties of its solution, its mathematical analysis, and specific concrete examples. We then provide extensive discussions about numerical methods, including finite element, finite difference, and spectral methods, for determining approximate solutions of the nonlocal models considered. In that discussion, we pay particular attention to a special class of nonlocal models that are the most widely studied in the literature, namely those involving fractional derivatives. The article ends with brief considerations of several modeling and algorithmic extensions which serve to show the wide applicability of nonlocal modeling.
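As a small concrete instance of the finite difference methods discussed for the fractional case, the sketch below implements the Grünwald-Letnikov approximation of a Riemann-Liouville fractional derivative in one dimension; the test function and order are illustrative choices, and the quadratic test case is used because its fractional derivative has a known closed form.

```python
# Grunwald-Letnikov finite-difference approximation of the order-alpha
# Riemann-Liouville derivative of f(x) = x^2 on [0, 1].
import numpy as np
from math import gamma

alpha, h = 0.5, 1e-3
x = np.arange(0.0, 1.0 + h, h)
f = x**2

# Weights g_k = (-1)^k * binom(alpha, k), computed by recurrence.
K = len(x)
g = np.empty(K)
g[0] = 1.0
for k in range(1, K):
    g[k] = g[k - 1] * (k - 1 - alpha) / k

# D^alpha f(x_i) ~ h^(-alpha) * sum_{k=0}^{i} g_k f(x_{i-k}): every grid
# point to the left contributes, reflecting the nonlocality of the operator.
Df = np.array([np.dot(g[:i + 1], f[i::-1]) for i in range(K)]) / h**alpha

exact = 2.0 / gamma(3.0 - alpha) * x**(2.0 - alpha)   # known closed form
print(f"max error on [0,1]: {np.max(np.abs(Df - exact)):.2e}")
```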
Many inference problems require the evaluation of complex and costly models. In this context, Bayesian methods have become very popular in several fields over recent years for parameter inversion, model selection and uncertainty quantification. Bayesian inference requires the approximation of complicated integrals involving (often costly) posterior distributions. Generally, this approximation is obtained by means of Monte Carlo (MC) methods. In order to reduce the computational cost of these techniques, surrogate models (also called emulators) are often employed. An alternative approach is the so-called Approximate Bayesian Computation (ABC) scheme. ABC does not require evaluation of the costly model, only the ability to simulate artificial data according to that model. Moreover, ABC requires the choice of a suitable distance between real and artificial data. In this work, we introduce a novel approach in which the expensive model is evaluated only at a set of well-chosen samples. The selection of these nodes is based on the so-called compressed Monte Carlo (CMC) scheme. We provide theoretical results supporting the novel algorithms and give empirical evidence of the performance of the proposed method in several numerical experiments, two of which are real-world applications in astronomy and satellite remote sensing.
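For reference, the ABC scheme mentioned above reduces, in its simplest rejection form, to the sketch below: simulate artificial data from the model and keep parameters whose simulated data fall within a tolerance of the observations. The toy model (a Gaussian with unknown mean), summary statistic, distance and tolerance are illustrative, not the paper's applications.

```python
# Rejection ABC on a toy problem: likelihood-free inference via simulation.
import numpy as np

rng = np.random.default_rng(2)

def simulate(theta, n=50):
    # Stand-in for a costly simulator we can sample from but whose
    # likelihood we pretend not to know.
    return rng.normal(theta, 1.0, size=n)

y_obs = simulate(3.0)                   # observed data (true theta = 3)
s_obs = y_obs.mean()                    # summary statistic

eps, accepted = 0.1, []
while len(accepted) < 500:
    theta = rng.uniform(-10.0, 10.0)    # draw from the prior
    # Distance between summaries of real and artificial data.
    if abs(simulate(theta).mean() - s_obs) < eps:
        accepted.append(theta)

post = np.array(accepted)
print(f"ABC posterior mean {post.mean():.3f} +/- {post.std():.3f}")
```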
In this paper, we introduce efficient ensemble Markov Chain Monte Carlo (MCMC) sampling methods for Bayesian computations in the univariate stochastic volatility model. We compare the performance of our ensemble MCMC methods with an improved version of a recent sampler of Kastner and Frühwirth-Schnatter (2014). We show that ensemble samplers are more efficient than this state-of-the-art sampler by a factor of about 3.1 on a data set simulated from the stochastic volatility model. This performance gain is achieved without the ensemble MCMC sampler relying on the assumption that the latent process is linear and Gaussian, unlike the sampler of Kastner and Frühwirth-Schnatter.
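As a generic example of an ensemble MCMC building block, the sketch below implements the Goodman-Weare "stretch move" on a toy two-dimensional target; this is a standard ensemble sampler offered for orientation only, not necessarily the specific ensemble scheme of the paper, and the target density and tuning constant are assumptions.

```python
# Goodman-Weare stretch move: each walker proposes a step along the line
# to a randomly chosen partner walker, making the sampler affine-invariant.
import numpy as np

rng = np.random.default_rng(3)

def log_target(x):
    # Toy correlated Gaussian stand-in for a posterior.
    return -0.5 * (x[0]**2 + (x[1] - x[0])**2)

K, d, a = 32, 2, 2.0                   # walkers, dimension, stretch constant
X = rng.normal(size=(K, d))
samples = []

for step in range(2000):
    for k in range(K):
        j = rng.integers(K - 1)
        j = j if j < k else j + 1      # partner walker, never itself
        z = (1 + (a - 1) * rng.uniform())**2 / a   # z ~ g(z) propto 1/sqrt(z)
        prop = X[j] + z * (X[k] - X[j])
        log_acc = (d - 1) * np.log(z) + log_target(prop) - log_target(X[k])
        if np.log(rng.uniform()) < log_acc:
            X[k] = prop
    samples.append(X.copy())

chain = np.concatenate(samples[500:])  # discard burn-in
print("posterior mean estimate:", chain.mean(axis=0))
```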
We investigate the use of data-driven likelihoods to bypass a key assumption made in many scientific analyses: that the true likelihood of the data is Gaussian. In particular, we suggest using the optimization targets of flow-based generative models, a class of models that can capture complex distributions by transforming a simple base distribution through layers of nonlinearities. We call these flow-based likelihoods (FBL). We analyze the accuracy and precision of the reconstructed likelihoods on mock Gaussian data, and show that simply gauging the quality of samples drawn from the trained model is not a sufficient indicator that the true likelihood has been learned. We nevertheless demonstrate that the likelihood can be reconstructed to a precision limited only by the sampling error due to finite sample size. We then apply FBLs to mock weak lensing convergence power spectra, a cosmological observable that is significantly non-Gaussian (NG). We find that the FBL captures the NG signatures in the data extremely well, while other commonly used data-driven likelihoods, such as Gaussian mixture models and independent component analysis, fail to do so. This suggests that studies which have found small posterior shifts in NG data with such data-driven likelihoods could be underestimating the impact of non-Gaussianity on parameter constraints. By introducing a suite of tests that can capture different levels of NG in the data, we show that the success or failure of traditional data-driven likelihoods can be tied back to the structure of the NG in the data. Unlike other methods, the flexibility of the FBL makes it successful at tackling different types of NG simultaneously. Because of this, and because of their likely applicability across datasets and domains, we encourage the use of FBLs for inference whenever sufficient mock data are available for training.
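The optimization target referred to above is the exact log-likelihood obtained from the change-of-variables formula, log p(x) = log N(z(x); 0, 1) + log|dz/dx|. The sketch below trains a single one-dimensional affine flow on this objective (equivalent to fitting a Gaussian) purely to make the objective concrete; practical FBLs stack many nonlinear coupling layers, and the mock data and learning rate here are assumptions.

```python
# Minimal flow-based likelihood: an invertible map z = (x - mu) / s trained
# by gradient ascent on the exact change-of-variables log-likelihood.
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(2.0, 0.5, size=5000)       # mock "observations"

mu, log_s = 0.0, 0.0                      # flow parameters: z = (x - mu) * exp(-log_s)
lr = 0.1
for _ in range(500):
    z = (x - mu) * np.exp(-log_s)
    # Gradients of the mean log-likelihood  -0.5*z^2 - log_s + const.
    g_mu = np.mean(z) * np.exp(-log_s)
    g_log_s = np.mean(z**2) - 1.0
    mu, log_s = mu + lr * g_mu, log_s + lr * g_log_s

def flow_log_likelihood(x_new):
    # The learned density, reusable as a data-driven likelihood in inference.
    z = (x_new - mu) * np.exp(-log_s)
    return -0.5 * z**2 - 0.5 * np.log(2 * np.pi) - log_s

print(f"learned mu={mu:.3f}, sigma={np.exp(log_s):.3f}")
```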