Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Computation

137 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Samuel Wiqvist

تاريخ النشر 2019

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Samuel Wiqvist - Pierre-Alexandre Mattei - Umberto Picchini

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We present a novel family of deep neural architectures, named partially exchangeable networks (PENs) that leverage probabilistic symmetries. By design, PENs are invariant to block-switch transformations, which characterize the partial exchangeability properties of conditionally Markovian processes. Moreover, we show that any block-switch invariant function has a PEN-like representation. The DeepSets architecture is a special case of PEN and we can therefore also target fully exchangeable data. We employ PENs to learn summary statistics in approximate Bayesian computation (ABC). When comparing PENs to previous deep learning methods for learning summary statistics, our results are highly competitive, both considering time series and static models. Indeed, PENs provide more reliable posterior samples even when using less training data.

قيم البحث

119 - Carlos D. Zuluaga , Edgar A. Valencia , Mauricio A. Alvarez 2015

Approximate Bayesian Computation (ABC) are likelihood-free Monte Carlo methods. ABC methods use a comparison between simulated data, using different parameters drew from a prior distribution, and observed data. This comparison process is based on com puting a distance between the summary statistics from the simulated data and the observed data. For complex models, it is usually difficult to define a methodology for choosing or constructing the summary statistics. Recently, a nonparametric ABC has been proposed, that uses a dissimilarity measure between discrete distributions based on empirical kernel embeddings as an alternative for summary statistics. The nonparametric ABC outperforms other methods including ABC, kernel ABC or synthetic likelihood ABC. However, it assumes that the probability distributions are discrete, and it is not robust when dealing with few observations. In this paper, we propose to apply kernel embeddings using an smoother density estimator or Parzen estimator for comparing the empirical data distributions, and computing the ABC posterior. Synthetic data and real data were used to test the Bayesian inference of our method. We compare our method with respect to state-of-the-art methods, and demonstrate that our method is a robust estimator of the posterior distribution in terms of the number of observations.

التعلم الالي

Some discussions of D. Fearnhead and D. Prangles Read Paper Constructing summary statistics for approximate Bayesian computation: semi-automatic approximate Bayesian computation

146 - Christophe Andrieu , Simon Barthelme , Nicolas Chopin 2012

This report is a collection of comments on the Read Paper of Fearnhead and Prangle (2011), to appear in the Journal of the Royal Statistical Society Series B, along with a reply from the authors.

المنهجية حساب

Learning Summary Statistic for Approximate Bayesian Computation via Deep Neural Network

81 - Bai Jiang , Tung-yu Wu , Charles Zheng 2015

Approximate Bayesian Computation (ABC) methods are used to approximate posterior distributions in models with unknown or computationally intractable likelihoods. Both the accuracy and computational efficiency of ABC depend on the choice of summary st atistic, but outside of special cases where the optimal summary statistics are known, it is unclear which guiding principles can be used to construct effective summary statistics. In this paper we explore the possibility of automating the process of constructing summary statistics by training deep neural networks to predict the parameters from artificially generated data: the resulting summary statistics are approximately posterior means of the parameters. With minimal model-specific tuning, our method constructs summary statistics for the Ising model and the moving-average model, which match or exceed theoretically-motivated summary statistics in terms of the accuracies of the resulting posteriors.

المنهجية حساب التعلم الالي

Bayesian Learning of Neural Network Architectures

155 - Georgi Dikov , Patrick van der Smagt , Justin Bayer 2019

In this paper we propose a Bayesian method for estimating architectural parameters of neural networks, namely layer size and network depth. We do this by learning concrete distributions over these parameters. Our results show that regular networks wi th a learnt structure can generalise better on small datasets, while fully stochastic networks can be more robust to parameter initialisation. The proposed method relies on standard neural variational learning and, unlike randomised architecture search, does not require a retraining of the model, thus keeping the computational overhead at minimum.

التعلم الالي التعلم الآلي

Loss-Calibrated Approximate Inference in Bayesian Neural Networks

74 - Adam D. Cobb , Stephen J. Roberts , Yarin Gal 2018

Current approaches in approximate inference for Bayesian neural networks minimise the Kullback-Leibler divergence to approximate the true posterior over the weights. However, this approximation is without knowledge of the final application, and there fore cannot guarantee optimal predictions for a given task. To make more suitable task-specific approximations, we introduce a new loss-calibrated evidence lower bound for Bayesian neural networks in the context of supervised learning, informed by Bayesian decision theory. By introducing a lower bound that depends on a utility function, we ensure that our approximation achieves higher utility than traditional methods for applications that have asymmetric utility functions. Furthermore, in using dropout inference, we highlight that our new objective is identical to that of standard dropout neural networks, with an additional utility-dependent penalty term. We demonstrate our new loss-calibrated model with an illustrative medical example and a restricted model capacity experiment, and highlight failure modes of the comparable weighted cross entropy approach. Lastly, we demonstrate the scalability of our method to real world applications with per-pixel semantic segmentation on an autonomous driving data set.

التعلم الالي التعلم الآلي