Scalable Bayesian Nonparametric Clustering and Classification

Added by Yang Ni

Publication date 2018

Fields Mathematical Statistics

Research language English

Authors Yang Ni - Peter Muller - Maurice Diesendruck

Computation Methodology

We develop a scalable multi-step Monte Carlo algorithm for inference under a large class of nonparametric Bayesian models for clustering and classification. Each step is embarrassingly parallel and can be implemented using the same Markov chain Monte Carlo sampler. The simplicity and generality of our approach make inference for a wide range of Bayesian nonparametric mixture models applicable to large datasets. Specifically, we apply the approach to inference under a product partition model with regression on covariates. We show results for inference with two motivating data sets: a large set of electronic health records (EHR) and a bank telemarketing dataset. We find interesting clusters and favorable classification performance relative to other widely used competing classifiers.
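
The abstract does not spell out the individual steps. As a rough illustration only of how a multi-step scheme with embarrassingly parallel steps might be organized, the Python sketch below splits the data into shards, clusters each shard independently, and then applies the same rule to the shard-level cluster means. The names `cluster_1d` and `multi_step_cluster`, the shard count, and the gap-based clusterer (a deterministic stand-in, not an MCMC sampler and not the authors' algorithm) are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def cluster_1d(points, gap=1.0):
    """Hypothetical stand-in clusterer (NOT an MCMC sampler).

    Sorts the points and starts a new cluster whenever two
    consecutive values differ by more than `gap`.
    """
    clusters = []
    for p in sorted(points):
        if clusters and p - clusters[-1][-1] <= gap:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return clusters


def multi_step_cluster(data, n_shards=2, gap=1.0):
    """Two-step sketch: local clustering per shard, then merging."""
    # Step 1: split the data into shards and cluster each shard
    # independently. The shards share nothing, so this step is
    # embarrassingly parallel.
    shards = [data[i::n_shards] for i in range(n_shards)]
    with ThreadPoolExecutor() as pool:
        local = list(pool.map(lambda s: cluster_1d(s, gap), shards))

    # Step 2: treat each local cluster as a single unit, summarized
    # by its mean, and apply the same gap rule to those means.
    local_clusters = [c for shard in local for c in shard]
    local_clusters.sort(key=lambda c: sum(c) / len(c))
    merged, last_mean = [], None
    for c in local_clusters:
        mean = sum(c) / len(c)
        if merged and mean - last_mean <= gap:
            merged[-1].extend(c)
        else:
            merged.append(list(c))
        last_mean = mean
    return merged


data = [0.0, 0.1, 0.2, 0.3, 10.0, 10.1, 10.2, 10.3]
final_clusters = multi_step_cluster(data, n_shards=2)
```

Threads are used here only to keep the sketch short; a real deployment would more likely distribute shards across processes or machines.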

Nonparametric Estimation of the Random Coefficients Model in Python

Emil Mendoza, Fabian Dunker, Marco Reale (2021)

We present $\textbf{PyRMLE}$, a Python module that implements Regularized Maximum Likelihood Estimation for the analysis of Random Coefficient models. $\textbf{PyRMLE}$ is simple to use and readily works with data formats that are typical of Random Coefficient problems. The module makes use of Python's scientific libraries $\textbf{NumPy}$ and $\textbf{SciPy}$ for computational efficiency. The main implementation of the algorithm is executed purely in Python code, which takes advantage of Python's high-level features.

Computation Methodology

A Bayesian Nonparametric Estimation of Mutual Information

Luai Al-Labadi, Forough Fazeli-Asl, Zahra Saberi (2021)

Mutual information is a widely used information-theoretic measure that quantifies the amount of association between variables. It is used extensively in many applications such as image registration, diagnosis of failures in electrical machines, pattern recognition, data mining and tests of independence. The main goal of this paper is to provide an efficient estimator of the mutual information based on the approach of Al-Labadi et al. (2021). The estimator is explored through various examples and is compared to its frequentist counterpart due to Berrett et al. (2019). The results show the good performance of the procedure, which attains a smaller mean squared error.

Computation Information Theory
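
For reference, the quantity estimated in the entry above has a standard definition. For continuous variables $X$ and $Y$ with joint density $p(x,y)$ and marginal densities $p(x)$ and $p(y)$ (notation assumed here, not taken from the abstract):

```latex
I(X;Y) \;=\; \iint p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,\mathrm{d}x\,\mathrm{d}y
```

$I(X;Y)$ is non-negative and equals zero exactly when $X$ and $Y$ are independent, which is why it can serve as the basis for tests of independence.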

Stratified sampling and bootstrapping for approximate Bayesian computation

Umberto Picchini, Richard G. Everitt (2019)

Approximate Bayesian computation (ABC) is computationally intensive for complex model simulators. To exploit expensive simulations, data-resampling via bootstrapping can be employed to obtain many artificial datasets at little cost. However, when using this approach within ABC, the posterior variance is inflated, thus resulting in biased posterior inference. Here we use stratified Monte Carlo to considerably reduce the bias induced by data resampling. We also show empirically that it is possible to obtain reliable inference using a larger than usual ABC threshold. Finally, we show that with stratified Monte Carlo we obtain a less variable ABC likelihood. Ultimately we show how our approach improves the computational efficiency of the ABC samplers. We construct several ABC samplers employing our methodology, such as rejection and importance ABC samplers, and ABC-MCMC samplers. We consider simulation studies for static (Gaussian, g-and-k distribution, Ising model, astronomical model) and dynamic models (Lotka-Volterra). We compare against state-of-the-art sequential Monte Carlo ABC samplers, synthetic likelihoods, and likelihood-free Bayesian optimization. For a computationally expensive Lotka-Volterra case study, we found that our strategy leads to a more than 10-fold computational saving, compared to a sampler that does not use our novel approach.

Computation Methodology
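
To make the terminology in the entry above concrete, here is a minimal sketch of a plain rejection ABC sampler, without the bootstrapping or stratification that the paper contributes. The toy Gaussian model, the uniform prior, the sample-mean summary statistic, the threshold `eps`, and the function name `abc_rejection` are all assumptions chosen for illustration.

```python
import random
import statistics


def abc_rejection(observed, n_draws=5000, eps=0.1, seed=0):
    """Plain rejection ABC for the mean of a unit-variance Gaussian.

    Assumed setup: prior theta ~ Uniform(-5, 5), summary statistic
    is the sample mean, and a draw is kept when the simulated and
    observed summaries differ by less than `eps`.
    """
    rng = random.Random(seed)
    obs_summary = statistics.fmean(observed)
    n = len(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)                       # draw from the prior
        simulated = [rng.gauss(theta, 1.0) for _ in range(n)]  # run the simulator
        if abs(statistics.fmean(simulated) - obs_summary) < eps:
            accepted.append(theta)                           # keep close matches
    return accepted


# Synthetic "observed" data with true mean 2.0 (an assumption).
data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]
posterior_draws = abc_rejection(observed)
```

Every proposal requires a fresh run of the simulator, so the cost grows quickly when the simulator is expensive. A larger `eps` accepts more draws at the price of a coarser approximation to the posterior.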

Exact Bayesian Analysis of Mixtures

Christian P. Robert, Kerrien L. Mengersen (Queensland University of Technology) (2010)

In this paper, we show how a complete and exact Bayesian analysis of a parametric mixture model is possible in some cases when components of the mixture are taken from exponential families and when conjugate priors are used. This restricted set-up allows us to show the relevance of the Bayesian approach as well as to exhibit the limitations of a complete analysis, namely that it is impossible to conduct this analysis when the sample size is too large, when the data are not from an exponential family, or when priors that are more complex than conjugate priors are used.

Computation Methodology
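
One way to see why sample size limits an exact analysis, as the entry above states, is to expand the mixture likelihood. In notation assumed here (not taken from the abstract), with $k$ components, weights $p_j$, component densities $f(\cdot \mid \theta_j)$, observations $x_1, \dots, x_n$ and allocation vectors $z \in \{1,\dots,k\}^n$:

```latex
\pi(\theta, p \mid x)
  \;\propto\; \pi(\theta, p)\,\prod_{i=1}^{n} \sum_{j=1}^{k} p_j\, f(x_i \mid \theta_j)
  \;=\; \sum_{z \in \{1,\dots,k\}^{n}} \pi(\theta, p)\,\prod_{i=1}^{n} p_{z_i}\, f(x_i \mid \theta_{z_i})
```

With exponential-family components and conjugate priors, each term of the sum has a closed form, but the sum ranges over $k^n$ allocations, so the number of terms grows exponentially in $n$.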

Bayesian nonparametric spectral density estimation using B-spline priors

Matthew C. Edwards, Renate Meyer (2017)

We present a new Bayesian nonparametric approach to estimating the spectral density of a stationary time series. A nonparametric prior based on a mixture of B-spline distributions is specified and can be regarded as a generalization of the Bernstein polynomial prior of Petrone (1999a,b) and Choudhuri et al. (2004). Whittle's likelihood approximation is used to obtain the pseudo-posterior distribution. This method allows for a data-driven choice of the number of mixture components and the location of knots. Posterior samples are obtained using a Metropolis-within-Gibbs Markov chain Monte Carlo algorithm, and mixing is improved using parallel tempering. We conduct a simulation study to demonstrate that for complicated spectral densities, the B-spline prior provides more accurate Monte Carlo estimates in terms of $L_1$-error and uniform coverage probabilities than the Bernstein polynomial prior. We apply the algorithm to annual mean sunspot data to estimate the solar cycle. Finally, we demonstrate the algorithm's ability to estimate a spectral density with sharp features, using real gravitational wave detector data from LIGO's sixth science run, recoloured to match the Advanced LIGO target sensitivity.

Computation
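
The Whittle likelihood mentioned in the entry above replaces the exact Gaussian likelihood with a sum over Fourier frequencies. The sketch below evaluates it for a candidate spectral density using only the standard library. The function names, the white-noise example, and the $1/(2\pi n)$ periodogram normalization are assumptions made for illustration (normalization conventions vary between references), and none of this is the paper's B-spline prior or sampler.

```python
import cmath
import math
import random


def periodogram(x):
    """Periodogram at the Fourier frequencies 2*pi*j/n, 0 < j < n/2."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    freqs, values = [], []
    for j in range(1, (n - 1) // 2 + 1):
        lam = 2 * math.pi * j / n
        # Direct O(n) discrete Fourier transform at frequency lam.
        dft = sum(v * cmath.exp(-1j * lam * t) for t, v in enumerate(centered))
        freqs.append(lam)
        values.append(abs(dft) ** 2 / (2 * math.pi * n))
    return freqs, values


def whittle_loglik(x, spectral_density):
    """Whittle log-likelihood: -sum_j [log f(lam_j) + I(lam_j) / f(lam_j)]."""
    freqs, values = periodogram(x)
    return -sum(
        math.log(spectral_density(lam)) + i_val / spectral_density(lam)
        for lam, i_val in zip(freqs, values)
    )


def white_noise_density(variance):
    """Flat spectral density of white noise with the given variance."""
    return lambda lam: variance / (2 * math.pi)


rng = random.Random(0)
series = [rng.gauss(0.0, 1.0) for _ in range(128)]
ll_true = whittle_loglik(series, white_noise_density(1.0))
ll_wrong = whittle_loglik(series, white_noise_density(100.0))
```

For simulated unit-variance white noise, the candidate density with variance 1 should score higher than the one with variance 100. In a Bayesian analysis this log-likelihood would be combined with a prior on the spectral density to form the pseudo-posterior.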
