Optimizing Threshold - Schedules for Approximate Bayesian Computation Sequential Monte Carlo Samplers: Applications to Molecular Systems


Abstract in English

The likelihood-free sequential Approximate Bayesian Computation (ABC) algorithms, are increasingly popular inference tools for complex biological models. Such algorithms proceed by constructing a succession of probability distributions over the parameter space conditional upon the simulated data lying in an $epsilon$--ball around the observed data, for decreasing values of the threshold $epsilon$. While in theory, the distributions (starting from a suitably defined prior) will converge towards the unknown posterior as $epsilon$ tends to zero, the exact sequence of thresholds can impact upon the computational efficiency and success of a particular application. In particular, we show here that the current preferred method of choosing thresholds as a pre-determined quantile of the distances between simulated and observed data from the previous population, can lead to the inferred posterior distribution being very different to the true posterior. Threshold selection thus remains an important challenge. Here we propose an automated and adaptive method that allows us to balance the need to minimise the threshold with computational efficiency. Moreover, our method which centres around predicting the threshold - acceptance rate curve using the unscented transform, enables us to avoid local minima - a problem that has plagued previous threshold schemes.

Download