Insights into complex, high-dimensional data can be obtained by discovering features of the data that match or do not match a model of interest. To formalize this task, we introduce the data selection problem: finding a lower-dimensional statistic - such as a subset of variables - that is well fit by a given parametric model of interest. A fully Bayesian approach to data selection would be to model the value of the statistic parametrically, model the remaining background components of the data nonparametrically, and perform standard Bayesian model selection for the choice of statistic. However, fitting a nonparametric model to high-dimensional data tends to be highly inefficient, both statistically and computationally. We propose the Stein volume criterion, a novel score for performing both data selection and model selection that takes the form of a generalized marginal likelihood with a kernelized Stein discrepancy in place of the Kullback-Leibler divergence. The Stein volume criterion does not require one to fit or even specify a nonparametric background model, making it straightforward to compute; in many cases it is as simple as fitting the parametric model of interest with an alternative objective function. We prove that the Stein volume criterion is consistent for both data selection and model selection, and we establish consistency and asymptotic normality (Bernstein-von Mises) of the corresponding generalized posterior on parameters. We validate our method in simulation and apply it to the analysis of single-cell RNA sequencing datasets using probabilistic principal component analysis and a spin glass model of gene regulation.
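To make the central quantity concrete, the following minimal Python sketch estimates the squared kernelized Stein discrepancy (KSD) between a sample and a parametric model whose score function (the gradient of its log density) is available. This is an illustration of the discrepancy that the Stein volume criterion builds on, not the authors' implementation; the RBF kernel, the fixed bandwidth, the V-statistic estimator, and the standard-Gaussian example score are all assumptions made for the example.

```python
# A minimal sketch (an assumption-laden illustration, not the paper's
# implementation) of the squared kernelized Stein discrepancy between a
# sample X and a model with score function s(x) = grad_x log p(x),
# using an RBF kernel k(x, y) = exp(-||x - y||^2 / (2 h^2)).
import numpy as np

def ksd_vstat(X, score_fn, bandwidth=1.0):
    """V-statistic estimate of the squared KSD with an RBF kernel.

    X        : (n, d) array of observations
    score_fn : maps (n, d) points to their (n, d) model scores
    """
    n, d = X.shape
    h2 = bandwidth ** 2
    S = score_fn(X)                                    # (n, d) model scores
    diff = X[:, None, :] - X[None, :, :]               # (n, n, d) pairwise x_i - x_j
    sqdist = np.sum(diff ** 2, axis=-1)                # (n, n) squared distances
    K = np.exp(-sqdist / (2 * h2))                     # RBF kernel matrix

    # Four terms of the Stein kernel u_p(x_i, x_j):
    term1 = (S @ S.T) * K                              # s(x)^T s(y) k(x, y)
    grad_y_k = diff / h2                               # grad_y k(x, y) / k(x, y)
    term2 = np.einsum('id,ijd->ij', S, grad_y_k) * K   # s(x)^T grad_y k(x, y)
    term3 = np.einsum('jd,ijd->ij', S, -grad_y_k) * K  # s(y)^T grad_x k(x, y)
    term4 = (d / h2 - sqdist / h2 ** 2) * K            # trace(grad_x grad_y k)

    return (term1 + term2 + term3 + term4).sum() / n ** 2

# Usage: data drawn from the model should yield a KSD estimate near zero.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
print(ksd_vstat(X, score_fn=lambda x: -x))             # score of N(0, I) is -x
```

Because the estimator only requires the model's score function and pairwise kernel evaluations, no background model for the rest of the data ever needs to be specified, which is the computational point the abstract emphasizes.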