
Bayesian Variable Selection for Single Index Logistic Model

Published by Hangjin Jiang
Publication date: 2020
Research field: Mathematical Statistics
Research language: English





In the era of big data, variable selection is a key technique for handling high-dimensional problems with a small sample size but a large number of covariates. Different variable selection methods have been proposed for different models, such as the linear model, the logistic model and the generalized linear model. However, fewer works have focused on variable selection for single index models, especially the single index logistic model, because of the difficulty arising from the unknown link function and the slow mixing rate of MCMC algorithms for the traditional logistic model. In this paper, we propose a Bayesian variable selection procedure for the single index logistic model by taking advantage of Gaussian processes and data augmentation. Numerical results from simulations and real data analysis show the advantage of our method over state-of-the-art methods.
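For orientation, a single index logistic model with a Gaussian process prior on the link can be written roughly as below. This is only a sketch of the general model class named in the abstract. The symbols $g$, $\beta$, $k$ and the unit-norm identifiability constraint are assumptions introduced here, not notation taken from the paper, and the abstract does not say which data augmentation scheme is used.

```latex
% Sketch of a single index logistic model (notation assumed, not from the paper)
\begin{align*}
  \Pr(y_i = 1 \mid x_i)
    &= \frac{1}{1 + \exp\{-g(x_i^{\top}\beta)\}},
    \qquad i = 1, \dots, n, \\
  g &\sim \mathcal{GP}\bigl(0,\, k(\cdot,\cdot)\bigr),
    \qquad \lVert \beta \rVert_2 = 1 .
\end{align*}
```

Here $g$ is the unknown link function and $x_i^{\top}\beta$ is the single index. Variable selection then amounts to deciding which entries of $\beta$ are nonzero.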




Read also

In this article, we propose new Bayesian methods for selecting and estimating a sparse coefficient vector for a skewed heteroscedastic response. Our novel Bayesian procedures effectively estimate the median and other quantile functions, accommodate non-local priors for regression effects without compromising ease of implementation via sampling-based tools, and asymptotically select the true set of predictors even when the number of covariates increases at the same order as the sample size. We also extend our method to deal with some observations with very large errors. Via simulation studies and a re-analysis of a medical cost study with a large number of potential predictors, we illustrate the ease of implementation and other practical advantages of our approach compared to existing methods for such studies.
We develop a Bayesian variable selection method, called SVEN, based on a hierarchical Gaussian linear model with priors placed on the regression coefficients as well as on the model space. Sparsity is achieved by using degenerate spike priors on inactive variables, whereas Gaussian slab priors are placed on the coefficients for the important predictors, making the posterior probability of a model available in explicit form (up to a normalizing constant). Strong model selection consistency is shown to be attained when the number of predictors grows nearly exponentially with the sample size and even when the norm of mean effects solely due to the unimportant variables diverges, which is a novel attractive feature. An appealing byproduct of SVEN is the construction of novel model-weight-adjusted prediction intervals. Embedding a unique model-based screening and using fast Cholesky updates, SVEN produces a highly scalable computational framework to explore gigantic model spaces, rapidly identify the regions of high posterior probability and make fast inference and prediction. A temperature schedule guided by our model selection consistency derivations is used to further mitigate multimodal posterior distributions. The performance of SVEN is demonstrated through a number of simulation experiments and a real data example from a genome-wide association study with over half a million markers.
An important task in building regression models is to decide which regressors should be included in the final model. In a Bayesian approach, variable selection can be performed using mixture priors with a spike and a slab component for the effects subject to selection. As the spike is concentrated at zero, variable selection is based on the probability of assigning the corresponding regression effect to the slab component. These posterior inclusion probabilities can be determined by MCMC sampling. In this paper we compare the MCMC implementations for several spike and slab priors with regard to posterior inclusion probabilities and their sampling efficiency for simulated data. Further, we investigate posterior inclusion probabilities analytically for different slabs in two simple settings. Application of variable selection with spike and slab priors is illustrated on a data set of psychiatric patients where the goal is to identify covariates affecting metabolism.
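To make the idea of a posterior inclusion probability concrete, here is a minimal Python sketch for one coefficient in a toy setting chosen for this illustration. It is not one of the settings or priors analysed in the paper above. It assumes a single noisy estimate z ~ N(beta, s2), a point-mass spike at zero and a Gaussian slab N(0, tau2) with prior inclusion weight w. The function names and parameters are hypothetical.

```python
import math


def normal_pdf(x, var):
    """Density of a zero-mean normal with variance `var`, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def posterior_inclusion_prob(z, s2, tau2, w):
    """Posterior probability that beta came from the slab (toy model, assumed).

    Assumed model: z | beta ~ N(beta, s2);
    beta = 0 with probability 1 - w (spike), beta ~ N(0, tau2) with probability w (slab).
    Integrating beta out gives z ~ N(0, s2) under the spike
    and z ~ N(0, s2 + tau2) under the slab.
    """
    slab = w * normal_pdf(z, s2 + tau2)
    spike = (1.0 - w) * normal_pdf(z, s2)
    return slab / (slab + spike)


if __name__ == "__main__":
    # Larger |z| moves posterior mass from the spike to the slab.
    for z in (0.0, 1.0, 2.0, 5.0):
        print(z, round(posterior_inclusion_prob(z, s2=1.0, tau2=1.0, w=0.5), 4))
```

In this toy case the probability has a closed form, so no MCMC is needed. With many correlated regressors the model space is too large to enumerate, which is why sampling is used in practice.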
Yang et al. (2016) proved that the symmetric random walk Metropolis-Hastings algorithm for Bayesian variable selection is rapidly mixing under mild high-dimensional assumptions. We propose a novel MCMC sampler using an informed proposal scheme, which we prove achieves a much faster mixing time that is independent of the number of covariates, under the same assumptions. To the best of our knowledge, this is the first high-dimensional result which rigorously shows that the mixing rate of informed MCMC methods can be fast enough to offset the computational cost of local posterior evaluation. Motivated by the theoretical analysis of our sampler, we further propose a new approach, called the two-stage drift condition, for studying convergence rates of Markov chains on general state spaces, which can be useful for obtaining tight complexity bounds in high-dimensional settings. The practical advantages of our algorithm are illustrated by both simulation studies and real data analysis.
In this paper we review the concepts of Bayesian evidence and Bayes factors, also known as log odds ratios, and their application to model selection. The theory is presented along with a discussion of analytic, approximate and numerical techniques. Specific attention is paid to the Laplace approximation, variational Bayes, importance sampling, thermodynamic integration, and nested sampling and its recent variants. Analogies to statistical physics, from which many of these techniques originate, are discussed in order to provide readers with deeper insights that may lead to new techniques. The utility of Bayesian model testing in the domain sciences is demonstrated by presenting four specific practical examples considered within the context of signal processing in the areas of signal detection, sensor characterization, scientific model selection and molecular force characterization.