ترغب بنشر مسار تعليمي؟ اضغط هنا

Bayesian Effect Selection in Structured Additive Distributional Regression Models

159   0   0.0 ( 0 )
 نشر من قبل Nadja Klein Prof. Dr.
 تاريخ النشر 2019
  مجال البحث الاحصاء الرياضي
والبحث باللغة English




اسأل ChatGPT حول البحث

We propose a novel spike and slab prior specification with scaled beta prime marginals for the importance parameters of regression coefficients to allow for general effect selection within the class of structured additive distributional regression. This enables us to model effects on all distributional parameters for arbitrary parametric distributions, and to consider various effect types such as non-linear or spatial effects as well as hierarchical regression structures. Our spike and slab prior relies on a parameter expansion that separates blocks of regression coefficients into overall scalar importance parameters and vectors of standardised coefficients. Hence, we can work with a scalar quantity for effect selection instead of a possibly high-dimensional effect vector, which yields improved shrinkage and sampling performance compared to the classical normal-inverse-gamma prior. We investigate the propriety of the posterior, show that the prior yields desirable shrinkage properties, propose a way of eliciting prior parameters and provide efficient Markov Chain Monte Carlo sampling. Using both simulated and three large-scale data sets, we show that our approach is applicable for data with a potentially large number of covariates, multilevel predictors accounting for hierarchically nested data and non-standard response distributions, such as bivariate normal or zero-inflated Poisson.



قيم البحث

اقرأ أيضاً

101 - Bruno Santos , Thomas Kneib 2019
Quantile regression models are a powerful tool for studying different points of the conditional distribution of univariate response variables. Their multivariate counterpart extension though is not straightforward, starting with the definition of mul tivariate quantiles. We propose here a flexible Bayesian quantile regression model when the response variable is multivariate, where we are able to define a structured additive framework for all predictor variables. We build on previous ideas considering a directional approach to define the quantiles of a response variable with multiple-outputs and we define noncrossing quantiles in every directional quantile model. We define a Markov Chain Monte Carlo (MCMC) procedure for model estimation, where the noncrossing property is obtained considering a Gaussian process design to model the correlation between several quantile regression models. We illustrate the results of these models using two data sets: one on dimensions of inequality in the population, such as income and health; the second on scores of students in the Brazilian High School National Exam, considering three dimensions for the response variable.
This paper considers the problem of variable selection in regression models in the case of functional variables that may be mixed with other type of variables (scalar, multivariate, directional, etc.). Our proposal begins with a simple null model and sequentially selects a new variable to be incorporated into the model based on the use of distance correlation proposed by cite{Szekely2007}. For the sake of simplicity, this paper only uses additive models. However, the proposed algorithm may assess the type of contribution (linear, non linear, ...) of each variable. The algorithm has shown quite promising results when applied to simulations and real data sets.
97 - Nadja Klein , Jorge Mateu 2021
Statistical techniques used in air pollution modelling usually lack the possibility to understand which predictors affect air pollution in which functional form; and are not able to regress on exceedances over certain thresholds imposed by authoritie s directly. The latter naturally induce conditional quantiles and reflect the seriousness of particular events. In the present paper we focus on this important aspect by developing quantile regression models further. We propose a general Bayesian effect selection approach for additive quantile regression within a highly interpretable framework. We place separate normal beta prime spike and slab priors on the scalar importance parameters of effect parts and implement a fast Gibbs sampling scheme. Specifically, it enables to study quantile-specific covariate effects, allows these covariates to be of general functional form using additive predictors, and facilitates the analysts decision whether an effect should be included linearly, non-linearly or not at all in the quantiles of interest. In a detailed analysis on air pollution data in Madrid (Spain) we find the added value of modelling extreme nitrogen dioxide (NO2) concentrations and how thresholds are driven differently by several climatological variables and traffic as a spatial proxy. Our results underpin the need of enhanced statistical models to support short-term decisions and enable local authorities to mitigate or even prevent exceedances of NO2 concentration limits.
We develop a Bayesian sum-of-trees model where each tree is constrained by a regularization prior to be a weak learner, and fitting and inference are accomplished via an iterative Bayesian backfitting MCMC algorithm that generates samples from a post erior. Effectively, BART is a nonparametric Bayesian regression approach which uses dimensionally adaptive random basis elements. Motivated by ensemble methods in general, and boosting algorithms in particular, BART is defined by a statistical model: a prior and a likelihood. This approach enables full posterior inference including point and interval estimates of the unknown regression function as well as the marginal effects of potential predictors. By keeping track of predictor inclusion frequencies, BART can also be used for model-free variable selection. BARTs many features are illustrated with a bake-off against competing methods on 42 different data sets, with a simulation experiment and on a drug discovery classification problem.
93 - Hao Ran , Yang Bai 2021
In many longitudinal studies, the covariate and response are often intermittently observed at irregular, mismatched and subject-specific times. How to deal with such data when covariate and response are observed asynchronously is an often raised prob lem. Bayesian Additive Regression Trees(BART) is a Bayesian non-Parametric approach which has been shown to be competitive with the best modern predictive methods such as random forest and boosted decision trees. The sum of trees structure combined with a Bayesian inferential framework provide a accurate and robust statistic method. BART variant soft Bayesian Additive Regression Trees(SBART) constructed using randomized decision trees was developed and substantial theoretical and practical benefits were shown. In this paper, we propose a weighted SBART model solution for asynchronous longitudinal data. In comparison to other methods, the current methods are valid under with little assumptions on the covariate process. Extensive simulation studies provide numerical support for this solution. And data from an HIV study is used to illustrate our methodology
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا