No Arabic abstract
In the design of efficient simulation algorithms, one is often beset with a poor choice of proposal distributions. Although the performance of a given simulation kernel can clarify a posteriori how adequate this kernel is for the problem at hand, a permanent on-line modification of kernels causes concerns about the validity of the resulting algorithm. While the issue is most often intractable for MCMC algorithms, the equivalent version for importance sampling algorithms can be validated quite precisely. We derive sufficient convergence conditions for adaptive mixtures of population Monte Carlo algorithms and show that Rao--Blackwelliz
Among Monte Carlo techniques, the importance sampling requires fine tuning of a proposal distribution, which is now fluently resolved through iterative schemes. The Adaptive Multiple Importance Sampling (AMIS) of Cornuet et al. (2012) provides a significant improvement in stability and effective sample size due to the introduction of a recycling procedure. However, the consistency of the AMIS estimator remains largely open. In this work we prove the convergence of the AMIS, at a cost of a slight modification in the learning process. Contrary to Douc et al. (2007a), results are obtained here in the asymptotic regime where the number of iterations is going to infinity while the number of drawings per iteration is a fixed, but growing sequence of integers. Hence some of the results shed new light on adaptive population Monte Carlo algorithms in that last regime.
The naive importance sampling estimator, based on samples from a single importance density, can be numerically unstable. Instead, we consider generalized importance sampling estimators where samples from more than one probability distribution are combined. We study this problem in the Markov chain Monte Carlo context, where independent samples are replaced with Markov chain samples. If the chains converge to their respective target distributions at a polynomial rate, then under two finite moment conditions, we show a central limit theorem holds for the generalized estimators. Further, we develop an easy to implement method to calculate valid asymptotic standard errors based on batch means. We also provide a batch means estimator for calculating asymptotically valid standard errors of Geyer(1994) reverse logistic estimator. We illustrate the method using a Bayesian variable selection procedure in linear regression. In particular, the generalized importance sampling estimator is used to perform empirical Bayes variable selection and the batch means estimator is used to obtain standard errors in a high-dimensional setting where current methods are not applicable.
The emergence of big data has led to a growing interest in so-called convergence complexity analysis, which is the study of how the convergence rate of a Monte Carlo Markov chain (for an intractable Bayesian posterior distribution) scales as the underlying data set grows in size. Convergence complexity analysis of practical Monte Carlo Markov chains on continuous state spaces is quite challenging, and there have been very few successful analyses of such chains. One fruitful analysis was recently presented by Qin and Hobert (2021b), who studied a Gibbs sampler for a simple Bayesian random effects model. These authors showed that, under regularity conditions, the geometric convergence rate of this Gibbs sampler converges to zero as the data set grows in size. It is shown herein that similar behavior is exhibited by Gibbs samplers for more general Bayesian models that possess both random effects and traditional continuous covariates, the so-called mixed models. The analysis employs the Wasserstein-based techniques introduced by Qin and Hobert (2021b).
We consider a generalization of the discrete-time Self Healing Umbrella Sampling method, which is an adaptive importance technique useful to sample multimodal target distributions. The importance function is based on the weights (namely the relative probabilities) of disjoint sets which form a partition of the space. These weights are unknown but are learnt on the fly yielding an adaptive algorithm. In the context of computational statistical physics, the logarithm of these weights is, up to a multiplicative constant, the free energy, and the discrete valued function defining the partition is called the collective variable. The algorithm falls into the general class of Wang-Landau type methods, and is a generalization of the original Self Healing Umbrella Sampling method in two ways: (i) the updating strategy leads to a larger penalization strength of already visited sets in order to escape more quickly from metastable states, and (ii) the target distribution is biased using only a fraction of the free energy, in order to increase the effective sample size and reduce the variance of importance sampling estimators. The algorithm can also be seen as a generalization of well-tempered metadynamics. We prove the convergence of the algorithm and analyze numerically its efficiency on a toy example.
The Adaptive Multiple Importance Sampling (AMIS) algorithm is aimed at an optimal recycling of past simulations in an iterated importance sampling scheme. The difference with earlier adaptive importance sampling implementations like Population Monte Carlo is that the importance weights of all simulated values, past as well as present, are recomputed at each iteration, following the technique of the deterministic multiple mixture estimator of Owen and Zhou (2000). Although the convergence properties of the algorithm cannot be fully investigated, we demonstrate through a challenging banana shape target distribution and a population genetics example that the improvement brought by this technique is substantial.