No Arabic abstract
Bayesian methods - either based on Bayes Factors or BIC - are now widely used for model selection. One property that might reasonably be demanded of any model selection method is that if a model ${M}_{1}$ is preferred to a model ${M}_{0}$, when these two models are expressed as members of one model class $mathbb{M}$, this preference is preserved when they are embedded in a different class $mathbb{M}$. However, we illustrate in this paper that with the usual implementation of these common Bayesian procedures this property does not hold true even approximately. We therefore contend that to use these methods it is first necessary for there to exist a natural embedding class. We argue that in any context like the one illustrated in our running example of Bayesian model selection of binary phylogenetic trees there is no such embedding.
In this article, we propose new Bayesian methods for selecting and estimating a sparse coefficient vector for skewed heteroscedastic response. Our novel Bayesian procedures effectively estimate the median and other quantile functions, accommodate non-local prior for regression effects without compromising ease of implementation via sampling based tools, and asymptotically select the true set of predictors even when the number of covariates increases in the same order of the sample size. We also extend our method to deal with some observations with very large errors. Via simulation studies and a re-analysis of a medical cost study with large number of potential predictors, we illustrate the ease of implementation and other practical advantages of our approach compared to existing methods for such studies.
Gaussian graphical models (GGMs) are well-established tools for probabilistic exploration of dependence structures using precision matrices. We develop a Bayesian method to incorporate covariate information in this GGMs setup in a nonlinear seemingly unrelated regression framework. We propose a joint predictor and graph selection model and develop an efficient collapsed Gibbs sampler algorithm to search the joint model space. Furthermore, we investigate its theoretical variable selection properties. We demonstrate our method on a variety of simulated data, concluding with a real data set from the TCPA project.
We develop a Bayesian methodology aimed at simultaneously estimating low-rank and row-sparse matrices in a high-dimensional multiple-response linear regression model. We consider a carefully devised shrinkage prior on the matrix of regression coefficients which obviates the need to specify a prior on the rank, and shrinks the regression matrix towards low-rank and row-sparse structures. We provide theoretical support to the proposed methodology by proving minimax optimality of the posterior mean under the prediction risk in ultra-high dimensional settings where the number of predictors can grow sub-exponentially relative to the sample size. A one-step post-processing scheme induced by group lasso penalties on the rows of the estimated coefficient matrix is proposed for variable selection, with default choices of tuning parameters. We additionally provide an estimate of the rank using a novel optimization function achieving dimension reduction in the covariate space. We exhibit the performance of the proposed methodology in an extensive simulation study and a real data example.
Objective Bayesian inference procedures are derived for the parameters of the multivariate random effects model generalized to elliptically contoured distributions. The posterior for the overall mean vector and the between-study covariance matrix is deduced by assigning two noninformative priors to the model parameter, namely the Berger and Bernardo reference prior and the Jeffreys prior, whose analytical expressions are obtained under weak distributional assumptions. It is shown that the only condition needed for the posterior to be proper is that the sample size is larger than the dimension of the data-generating model, independently of the class of elliptically contoured distributions used in the definition of the generalized multivariate random effects model. The theoretical findings of the paper are applied to real data consisting of ten studies about the effectiveness of hypertension treatment for reducing blood pressure where the treatment effects on both the systolic blood pressure and diastolic blood pressure are investigated.
While there have been a lot of recent developments in the context of Bayesian model selection and variable selection for high dimensional linear models, there is not much work in the presence of change point in literature, unlike the frequentist counterpart. We consider a hierarchical Bayesian linear model where the active set of covariates that affects the observations through a mean model can vary between different time segments. Such structure may arise in social sciences/ economic sciences, such as sudden change of house price based on external economic factor, crime rate changes based on social and built-environment factors, and others. Using an appropriate adaptive prior, we outline the development of a hierarchical Bayesian methodology that can select the true change point as well as the true covariates, with high probability. We provide the first detailed theoretical analysis for posterior consistency with or without covariates, under suitable conditions. Gibbs sampling techniques provide an efficient computational strategy. We also consider small sample simulation study as well as application to crime forecasting applications.