Sublinear-Time Approximate MCMC Transitions for Probabilistic Programs

660 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Yutian Chen

تاريخ النشر 2014

مجال البحث الاحصاء الرياضي

والبحث باللغة English

تأليف Yutian Chen - Vikash Mansinghka - Zoubin Ghahramani

التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Probabilistic programming languages can simplify the development of machine learning techniques, but only if inference is sufficiently scalable. Unfortunately, Bayesian parameter estimation for highly coupled models such as regressions and state-space models still scales poorly; each MCMC transition takes linear time in the number of observations. This paper describes a sublinear-time algorithm for making Metropolis-Hastings (MH) updates to latent variables in probabilistic programs. The approach generalizes recently introduced approximate MH techniques: instead of subsampling data items assumed to be independent, it subsamples edges in a dynamically constructed graphical model. It thus applies to a broader class of problems and interoperates with other general-purpose inference techniques. Empirical results, including confirmation of sublinear per-transition scaling, are presented for Bayesian logistic regression, nonlinear classification via joint Dirichlet process mixtures, and parameter estimation for stochastic volatility models (with state estimation via particle MCMC). All three applications use the same implementation, and each requires under 20 lines of probabilistic code.

قيم البحث

85 - Tom Rainforth , Tuan Anh Le , Jan-Willem van de Meent 2017

We present the first general purpose framework for marginal maximum a posteriori estimation of probabilistic program variables. By using a series of code transformations, the evidence of any probabilistic program, and therefore of any graphical model , can be optimized with respect to an arbitrary subset of its sampled variables. To carry out this optimization, we develop the first Bayesian optimization package to directly exploit the source code of its target, leading to innovations in problem-independent hyperpriors, unbounded optimization, and implicit constraint satisfaction; delivering significant performance improvements over prominent existing packages. We present applications of our method to a number of tasks including engineering design and parameter optimization.

التعلم الالي الذكاء الاصطناعي لغات البرمجة

Nesting Probabilistic Programs

57 - Tom Rainforth 2018

We formalize the notion of nesting probabilistic programming queries and investigate the resulting statistical implications. We demonstrate that while query nesting allows the definition of models which could not otherwise be expressed, such as those involving agents reasoning about other agents, existing systems take approaches which lead to inconsistent estimates. We show how to correct this by delineating possible ways one might want to nest queries and asserting the respective conditions required for convergence. We further introduce a new online nested Monte Carlo estimator that makes it substantially easier to ensure these conditions are met, thereby providing a simple framework for designing statistically correct inference engines. We prove the correctness of this online estimator and show that, when using the recommended setup, its asymptotic variance is always better than that of the equivalent fixed estimator, while its bias is always within a factor of two.

التعلم الالي لغات البرمجة حساب

Sampling for Bayesian Mixture Models: MCMC with Polynomial-Time Mixing

360 - Wenlong Mou , Nhat Ho , Martin J. Wainwright 2019

We study the problem of sampling from the power posterior distribution in Bayesian Gaussian mixture models, a robust version of the classical posterior. This power posterior is known to be non-log-concave and multi-modal, which leads to exponential m ixing times for some standard MCMC algorithms. We introduce and study the Reflected Metropolis-Hastings Random Walk (RMRW) algorithm for sampling. For symmetric two-component Gaussian mixtures, we prove that its mixing time is bounded as $d^{1.5}(d + Vert theta_{0} Vert^2)^{4.5}$ as long as the sample size $n$ is of the order $d (d + Vert theta_{0} Vert^2)$. Notably, this result requires no conditions on the separation of the two means. En route to proving this bound, we establish some new results of possible independent interest that allow for combining Poincar{e} inequalities for conditional and marginal densities.

التعلم الالي بنى وهياكل البيانات والخوارزميات التعلم الآلي

Learning Multi-layer Latent Variable Model via Variational Optimization of Short Run MCMC for Approximate Inference

354 - Erik Nijkamp , Bo Pang , Tian Han 2019

This paper studies the fundamental problem of learning deep generative models that consist of multiple layers of latent variables organized in top-down architectures. Such models have high expressivity and allow for learning hierarchical representati ons. Learning such a generative model requires inferring the latent variables for each training example based on the posterior distribution of these latent variables. The inference typically requires Markov chain Monte Caro (MCMC) that can be time consuming. In this paper, we propose to use noise initialized non-persistent short run MCMC, such as finite step Langevin dynamics initialized from the prior distribution of the latent variables, as an approximate inference engine, where the step size of the Langevin dynamics is variationally optimized by minimizing the Kullback-Leibler divergence between the distribution produced by the short run MCMC and the posterior distribution. Our experiments show that the proposed method outperforms variational auto-encoder (VAE) in terms of reconstruction error and synthesis quality. The advantage of the proposed method is that it is simple and automatic without the need to design an inference model.

التعلم الالي التعلم الآلي

Inferring Covariances for Probabilistic Programs

118 - Benjamin Lucien Kaminski , Joost-Pieter Katoen , Christoph Matheja 2016

We study weakest precondition reasoning about the (co)variance of outcomes and the variance of run-times of probabilistic programs with conditioning. For outcomes, we show that approximating (co)variances is computationally more difficult than approx imating expected values. In particular, we prove that computing both lower and upper bounds for (co)variances is $Sigma^{0}_{2}$-complete. As a consequence, neither lower nor upper bounds are computably enumerable. We therefore present invariant-based techniques that do enable enumeration of both upper and lower bounds, once appropriate invariants are found. Finally, we extend this approach to reasoning about run-time variances.

المنطق في علوم الحاسوب

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الشھباء الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Sublinear-Time Approximate MCMC Transitions for Probabilistic Programs

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً