Bayesian semiparametric analysis of multivariate continuous responses, with variable selection


Abstract

This article presents an approach to Bayesian semiparametric inference for Gaussian multivariate response regression. We are motivated by various small- and medium-dimensional problems from the physical and social sciences. The statistical challenges revolve around the unknown mean and variance functions and, in particular, the correlation matrix. To tackle these problems, we develop priors over the smooth functions and a Markov chain Monte Carlo (MCMC) algorithm for inference and model selection. Specifically, Dirichlet process mixtures of Gaussian distributions are used as the basis for a cluster-inducing prior over the elements of the correlation matrix. The smooth, multidimensional means and variances are represented using radial basis function expansions. The complexity of the model, in terms of variable selection and smoothness, is then controlled by spike-slab priors. A simulation study demonstrates performance as the response dimension increases. Finally, the model is fit to a number of real-world datasets. An R package, scripts for replicating the synthetic and real data examples, and a detailed description of the MCMC sampler are available in the supplementary materials online.
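
To make the combination of radial basis function expansions and spike-slab priors concrete, the following is a minimal sketch and not the article's implementation. It assumes a Gaussian radial kernel, a point-mass spike at zero, and a zero-mean normal slab; the function names, the knots, the kernel scale, and the inclusion probability are all hypothetical choices made for illustration. The accompanying R package should be consulted for the actual model.

```python
import math
import random


def gaussian_rbf(x, knot, scale):
    """Gaussian radial basis function centred at `knot` (assumed kernel form)."""
    return math.exp(-((x - knot) ** 2) / (2.0 * scale ** 2))


def design_row(x, knots, scale):
    """Basis expansion of a scalar covariate x: one basis value per knot."""
    return [gaussian_rbf(x, k, scale) for k in knots]


def draw_spike_slab(n_coef, incl_prob, slab_sd, rng):
    """Draw inclusion indicators and coefficients from a spike-slab prior.

    gamma_j ~ Bernoulli(incl_prob); beta_j is exactly 0 when gamma_j == 0
    (the spike) and N(0, slab_sd^2) when gamma_j == 1 (the slab).
    """
    gamma = [1 if rng.random() < incl_prob else 0 for _ in range(n_coef)]
    beta = [rng.gauss(0.0, slab_sd) if g else 0.0 for g in gamma]
    return gamma, beta


def smooth_function(x, knots, scale, beta):
    """Evaluate f(x) = sum_j beta_j * phi_j(x) for the given coefficients."""
    return sum(b * phi for b, phi in zip(beta, design_row(x, knots, scale)))


rng = random.Random(0)                      # fixed seed for reproducibility
knots = [0.0, 0.25, 0.5, 0.75, 1.0]         # hypothetical knot locations
gamma, beta = draw_spike_slab(len(knots), incl_prob=0.5, slab_sd=1.0, rng=rng)
values = [smooth_function(i / 10, knots, 0.2, beta) for i in range(11)]
```

Under these assumptions, basis terms whose indicator is zero drop out of the expansion entirely. This is how a spike-slab prior can control both which covariates enter the model and how wiggly each fitted function is.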
