An Empirical Study of Stochastic Variational Algorithms for the Beta Bernoulli Process

Published by Amar Shah

Publication date: 2015

Research field: Mathematical statistics, Informatics engineering

Paper language: English

Authors: Amar Shah, David A. Knowles, Zoubin Ghahramani

Machine learning, Applications of statistics

Abstract

Stochastic variational inference (SVI) is emerging as the most promising candidate for scaling inference in Bayesian probabilistic models to large datasets. However, the performance of these methods has been assessed primarily in the context of Bayesian topic models, particularly latent Dirichlet allocation (LDA). Deriving several new algorithms, and using synthetic, image and genomic datasets, we investigate whether the understanding gleaned from LDA applies in the setting of sparse latent factor models, specifically beta process factor analysis (BPFA). We demonstrate that the big picture is consistent: using Gibbs sampling within SVI to maintain certain posterior dependencies is extremely effective. However, we find that different posterior dependencies are important in BPFA than in LDA. In particular, approximations able to model intra-local variable dependence perform best.
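As background for the abstract above, the generic SVI update of a global variational parameter can be written as below. This is the standard form for conditionally conjugate models, not one of the new BPFA algorithms the paper derives; the symbols $\lambda$, $\hat{\lambda}$, $\rho_t$, $\tau$ and $\kappa$ are notation assumed here rather than taken from the paper.

```latex
\lambda^{(t)} = (1 - \rho_t)\,\lambda^{(t-1)} + \rho_t\,\hat{\lambda}^{(t)},
\qquad
\rho_t = (t + \tau)^{-\kappa}, \quad \kappa \in (0.5, 1], \ \tau \ge 0
```

Here $\hat{\lambda}^{(t)}$ is the value the global parameter would take if the sampled minibatch were replicated to the size of the full dataset, and the step-size schedule satisfies the Robbins-Monro conditions $\sum_t \rho_t = \infty$ and $\sum_t \rho_t^2 < \infty$.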

Xueying Tang, Zhi Wang, Jingchen Liu, 2019

Computer simulations have become a popular tool for assessing complex skills such as problem-solving skills. Log files of computer-based items record the entire human-computer interactive processes for each respondent. The response processes are very diverse, noisy, and of nonstandard formats. Few generic methods have been developed for exploiting the information contained in process data. In this article, we propose a method to extract latent variables from process data. The method utilizes a sequence-to-sequence autoencoder to compress response processes into standard numerical vectors. It does not require prior knowledge of the specific items and human-computer interaction patterns. The proposed method is applied to both simulated and real process data to demonstrate that the resulting latent variables extract useful information from the response processes.

Machine learning, Applications of statistics

Stochastic Variational Optimization

Thomas Bird, Julius Kunze, David Barber, 2018

Variational Optimization forms a differentiable upper bound on an objective. We show that approaches such as Natural Evolution Strategies and Gaussian Perturbation are special cases of Variational Optimization in which the expectations are approximated by Gaussian sampling. These approaches are of particular interest because they are parallelizable. We calculate the approximate bias and variance of the corresponding gradient estimators and demonstrate that using antithetic sampling or a baseline is crucial to mitigate their problems. We contrast these methods with an alternative parallelizable method, namely Directional Derivatives. We conclude that, for differentiable objectives, using Directional Derivatives is preferable to using Variational Optimization to perform parallel Stochastic Gradient Descent.
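The idea in this abstract can be illustrated with a minimal sketch, assuming a Gaussian search distribution with a fixed scale: the expectation of f under N(mu, sigma^2 I) upper-bounds min f, and its gradient with respect to mu can be estimated from antithetic pairs of Gaussian samples. The function names (`vo_gradient`, `minimise`, `sphere`), the test objective, and all hyperparameters are hypothetical choices for illustration, not the authors' implementation.

```python
import random


def vo_gradient(f, mu, sigma, n_pairs, rng):
    """Antithetic Monte Carlo estimate of d/dmu E_{x ~ N(mu, sigma^2 I)}[f(x)]."""
    dim = len(mu)
    grad = [0.0] * dim
    for _ in range(n_pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x_plus = [m + sigma * e for m, e in zip(mu, eps)]
        x_minus = [m - sigma * e for m, e in zip(mu, eps)]
        # Antithetic pair: any constant offset in f cancels in the difference,
        # which reduces the variance of the estimator.
        diff = f(x_plus) - f(x_minus)
        for i in range(dim):
            grad[i] += diff * eps[i] / (2.0 * sigma * n_pairs)
    return grad


def minimise(f, mu0, sigma=0.1, lr=0.1, steps=200, n_pairs=10, seed=0):
    """Plain stochastic gradient descent on the mean of the search distribution."""
    rng = random.Random(seed)
    mu = list(mu0)
    for _ in range(steps):
        g = vo_gradient(f, mu, sigma, n_pairs, rng)
        mu = [m - lr * gi for m, gi in zip(mu, g)]
    return mu


def sphere(x):
    """Hypothetical test objective with its minimum at (1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


result = minimise(sphere, [0.0, 0.0])
print(result)
```

Each antithetic pair needs only two evaluations of f and no gradients of f, so the pairs can be evaluated in parallel, which is the property the abstract highlights.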

Machine learning

Structure Adaptive Algorithms for Stochastic Bandits

Remy Degenne, Han Shao, Wouter M. Koolen, 2020

We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g. linear, unimodal, sparse, etc. Our aim is to develop methods that are flexible (in that they easily adapt to different structures), powerful (in that they perform well empirically and/or provably match instance-dependent lower bounds) and efficient (in that the per-round computational burden is small). We develop asymptotically optimal algorithms from instance-dependent lower bounds using iterative saddle-point solvers. Our approach generalises recent iterative methods for pure exploration to reward maximisation, where a major challenge arises from the estimation of the sub-optimality gaps and their reciprocals. Still, we manage to achieve all the above desiderata. Notably, our technique avoids the computational cost of the full-blown saddle point oracle employed by previous work, while at the same time enabling finite-time regret bounds. Our experiments reveal that our method successfully leverages the structural assumptions, while its regret is at worst comparable to that of vanilla UCB.

Machine learning

Sparse Gaussian Process Variational Autoencoders

Matthew Ashman, Jonathan So, Will Tebbutt, 2020

Large, multi-dimensional spatio-temporal datasets are omnipresent in modern science and engineering. An effective framework for handling such data is the Gaussian process deep generative model (GP-DGM), which employs GP priors over the latent variables of DGMs. Existing approaches for performing inference in GP-DGMs do not support sparse GP approximations based on inducing points, which are essential for the computational efficiency of GPs, nor do they handle missing data, a natural occurrence in many spatio-temporal datasets, in a principled manner. We address these shortcomings with the development of the sparse Gaussian process variational autoencoder (SGP-VAE), characterised by the use of partial inference networks for parameterising sparse GP approximations. Leveraging the benefits of amortised variational inference, the SGP-VAE enables inference in multi-output sparse GPs on previously unobserved data with no additional training. The SGP-VAE is evaluated in a variety of experiments where it outperforms alternative approaches including multi-output GPs and structured VAEs.

Machine learning, Neural and evolutionary computing

Scalable Gaussian Process Variational Autoencoders

Metod Jazbec, Matthew Ashman, Vincent Fortuin, 2020

Conventional variational autoencoders fail to model correlations between data points due to their use of factorized priors. Amortized Gaussian process inference through GP-VAEs has led to significant improvements in this regard, but is still inhibited by the intrinsic complexity of exact GP inference. We improve the scalability of these methods through principled sparse inference approaches. We propose a new scalable GP-VAE model that outperforms existing approaches in terms of runtime and memory footprint, is easy to implement, and allows for joint end-to-end optimization of all components.

Machine learning