أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Umberto Picchini

Sequential Neural Posterior and Likelihood Approximation

169 - Samuel Wiqvist , Jes Frellsen , Umberto Picchini 2021

We introduce the sequential neural posterior and likelihood approximation (SNPLA) algorithm. SNPLA is a normalizing flows-based algorithm for inference in implicit models, and therefore is a simulation-based inference method that only requires simula tions from a generative model. SNPLA avoids Markov chain Monte Carlo sampling and correction-steps of the parameter proposal function that are introduced in similar methods, but that can be numerically unstable or restrictive. By utilizing the reverse KL divergence, SNPLA manages to learn both the likelihood and the posterior in a sequential manner. Over four experiments, we show that SNPLA performs competitively when utilizing the same number of model simulations as used in other methods, even though the inference problem for SNPLA is more complex due to the joint learning of posterior and likelihood function. Due to utilizing normalizing flows SNPLA generates posterior draws much faster (4 orders of magnitude) than MCMC-based methods.

التعلم الالي التعلم الآلي المنهجية

Sequentially guided MCMC proposals for synthetic likelihoods and correlated synthetic likelihoods

156 - Umberto Picchini , Umberto Simola , Jukka Corander 2020

Synthetic likelihood (SL) is a strategy for parameter inference when the likelihood function is analytically or computationally intractable. In SL, the likelihood function of the data is replaced by a multivariate Gaussian density over summary statis tics of the data. SL requires simulation of many replicate datasets at every parameter value considered by a sampling algorithm, such as MCMC, making the method computationally-intensive. We propose two strategies to alleviate the computational burden imposed by SL algorithms. We first introduce a novel MCMC algorithm for SL where the proposal distribution is sequentially tuned and is also made conditional to data, thus it rapidly guides the proposed parameters towards high posterior probability regions. Second, we exploit strategies borrowed from the correlated pseudo-marginal MCMC literature, to improve the chains mixing in a SL framework. Our methods enable inference for challenging case studies when the chain is initialised in low posterior probability regions of the parameter space, where standard samplers failed. Our guided sampler can also be potentially used with MCMC samplers for approximate Bayesian computation (ABC). Our goal is to provide ways to make the best out of each expensive MCMC iteration, which will broaden the scope of likelihood-free inference for models with costly simulators. To illustrate the advantages stemming from our framework we consider four benchmark examples, including estimation of parameters for a cosmological model and a stochastic model with highly non-Gaussian summary statistics.

المنهجية

Stratified sampling and bootstrapping for approximate Bayesian computation

163 - Umberto Picchini , Richard G. Everitt 2019

Approximate Bayesian computation (ABC) is computationally intensive for complex model simulators. To exploit expensive simulations, data-resampling via bootstrapping can be employed to obtain many artificial datasets at little cost. However, when usi ng this approach within ABC, the posterior variance is inflated, thus resulting in biased posterior inference. Here we use stratified Monte Carlo to considerably reduce the bias induced by data resampling. We also show empirically that it is possible to obtain reliable inference using a larger than usual ABC threshold. Finally, we show that with stratified Monte Carlo we obtain a less variable ABC likelihood. Ultimately we show how our approach improves the computational efficiency of the ABC samplers. We construct several ABC samplers employing our methodology, such as rejection and importance ABC samplers, and ABC-MCMC samplers. We consider simulation studies for static (Gaussian, g-and-k distribution, Ising model, astronomical model) and dynamic models (Lotka-Volterra). We compare against state-of-art sequential Monte Carlo ABC samplers, synthetic likelihoods, and likelihood-free Bayesian optimization. For a computationally expensive Lotka-Volterra case study, we found that our strategy leads to a more than 10-fold computational saving, compared to a sampler that does not use our novel approach.

حساب المنهجية

Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Computation

136 - Samuel Wiqvist , Pierre-Alexandre Mattei , Umberto Picchini 2019

We present a novel family of deep neural architectures, named partially exchangeable networks (PENs) that leverage probabilistic symmetries. By design, PENs are invariant to block-switch transformations, which characterize the partial exchangeability properties of conditionally Markovian processes. Moreover, we show that any block-switch invariant function has a PEN-like representation. The DeepSets architecture is a special case of PEN and we can therefore also target fully exchangeable data. We employ PENs to learn summary statistics in approximate Bayesian computation (ABC). When comparing PENs to previous deep learning methods for learning summary statistics, our results are highly competitive, both considering time series and static models. Indeed, PENs provide more reliable posterior samples even when using less training data.

التعلم الالي التعلم الآلي حساب

Accelerating delayed-acceptance Markov chain Monte Carlo algorithms

69 - Samuel Wiqvist , Umberto Picchini , Julie Lyng Forman 2018

Delayed-acceptance Markov chain Monte Carlo (DA-MCMC) samples from a probability distribution via a two-stages version of the Metropolis-Hastings algorithm, by combining the target distribution with a surrogate (i.e. an approximate and computationall y cheaper version) of said distribution. DA-MCMC accelerates MCMC sampling in complex applications, while still targeting the exact distribution. We design a computationally faster, albeit approximate, DA-MCMC algorithm. We consider parameter inference in a Bayesian setting where a surrogate likelihood function is introduced in the delayed-acceptance scheme. When the evaluation of the likelihood function is computationally intensive, our scheme produces a 2-4 times speed-up, compared to standard DA-MCMC. However, the acceleration is highly problem dependent. Inference results for the standard delayed-acceptance algorithm and our approximated version are similar, indicating that our algorithm can return reliable Bayesian inference. As a computationally intensive case study, we introduce a novel stochastic differential equation model for protein folding data.

حساب

Likelihood-free stochastic approximation EM for inference in complex models

119 - Umberto Picchini 2016

A maximum likelihood methodology for the parameters of models with an intractable likelihood is introduced. We produce a likelihood-free version of the stochastic approximation expectation-maximization (SAEM) algorithm to maximize the likelihood func tion of model parameters. While SAEM is best suited for models having a tractable complete likelihood function, its application to moderately complex models is a difficult or even impossible task. We show how to construct a likelihood-free version of SAEM by using the synthetic likelihood paradigm. Our method is completely plug-and-play, requires almost no tuning and can be applied to both static and dynamic models. Four simulation studies illustrate the method, including a stochastic differential equation model, a stochastic Lotka-Volterra model and data from $g$-and-$k$ distributions. MATLAB code is available as supplementary material.

المنهجية حساب

Bayesian inference for stochastic differential equation mixed effects models of a tumor xenography study

168 - Umberto Picchini , Julie Lyng Forman 2016

We consider Bayesian inference for stochastic differential equation mixed effects models (SDEMEMs) exemplifying tumor response to treatment and regrowth in mice. We produce an extensive study on how a SDEMEM can be fitted using both exact inference b ased on pseudo-marginal MCMC and approximate inference via Bayesian synthetic likelihoods (BSL). We investigate a two-compartments SDEMEM, these corresponding to the fractions of tumor cells killed by and survived to a treatment, respectively. Case study data considers a tumor xenography study with two treatment groups and one control, each containing 5-8 mice. Results from the case study and from simulations indicate that the SDEMEM is able to reproduce the observed growth patterns and that BSL is a robust tool for inference in SDEMEMs. Finally, we compare the fit of the SDEMEM to a similar ordinary differential equation model. Due to small sample sizes, strong prior information is needed to identify all model parameters in the SDEMEM and it cannot be determined which of the two models is the better in terms of predicting tumor growth curves. In a simulation study we find that with a sample of 17 mice per group BSL is able to identify all model parameters and distinguish treatment groups.

تطبيقات الإحصاء حساب المنهجية

Coupling stochastic EM and Approximate Bayesian Computation for parameter inference in state-space models

116 - Umberto Picchini , Adeline Samson 2015

We study the class of state-space models and perform maximum likelihood estimation for the model parameters. We consider a stochastic approximation expectation-maximization (SAEM) algorithm to maximize the likelihood function with the novelty of usin g approximate Bayesian computation (ABC) within SAEM. The task is to provide each iteration of SAEM with a filtered state of the system, and this is achieved using an ABC sampler for the hidden state, based on sequential Monte Carlo (SMC) methodology. It is shown that the resulting SAEM-ABC algorithm can be calibrated to return accurate inference, and in some situations it can outperform a version of SAEM incorporating the bootstrap filter. Two simulation studies are presented, first a nonlinear Gaussian state-space model then a state-space model having dynamics expressed by a stochastic differential equation. Comparisons with iterated filtering for maximum likelihood inference, and Gibbs sampling and particle marginal methods for Bayesian inference are presented.

حساب المنهجية

Approximate maximum likelihood estimation using data-cloning ABC

128 - Umberto Picchini , Rachele Anderson 2015

A maximum likelihood methodology for a general class of models is presented, using an approximate Bayesian computation (ABC) approach. The typical target of ABC methods are models with intractable likelihoods, and we combine an ABC-MCMC sampler with so-called data cloning for maximum likelihood estimation. Accuracy of ABC methods relies on the use of a small threshold value for comparing simulations from the model and observed data. The proposed methodology shows how to use large threshold values, while the number of data-clones is increased to ease convergence towards an approximate maximum likelihood estimate. We show how to exploit the methodology to reduce the number of iterations of a standard ABC-MCMC algorithm and therefore reduce the computational effort, while obtaining reasonable point estimates. Simulation studies show the good performance of our approach on models with intractable likelihoods such as g-and-k distributions, stochastic differential equations and state-space models.

المنهجية حساب

Accelerating inference for diffusions observed with measurement error and large sample sizes using Approximate Bayesian Computation

123 - Umberto Picchini , Julie Lyng Forman 2013

In recent years dynamical modelling has been provided with a range of breakthrough methods to perform exact Bayesian inference. However it is often computationally unfeasible to apply exact statistical methodologies in the context of large datasets a nd complex models. This paper considers a nonlinear stochastic differential equation model observed with correlated measurement errors and an application to protein folding modelling. An Approximate Bayesian Computation (ABC) MCMC algorithm is suggested to allow inference for model parameters within reasonable time constraints. The ABC algorithm uses simulations of subsamples from the assumed data generating model as well as a so-called early rejection strategy to speed up computations in the ABC-MCMC sampler. Using a considerate amount of subsamples does not seem to degrade the quality of the inferential results for the considered applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter resulting two orders of magnitude slower than ABC-MCMC for the considered setup. Finally the ABC algorithm is applied to a large size protein data. The suggested methodology is fairly general and not limited to the exemplified model and data.

حساب تطبيقات الإحصاء

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد