مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Variational Bayes for high-dimensional linear regression with sparse priors

260 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Kolyan Ray

تاريخ النشر 2019

مجال البحث الاحصاء الرياضي

والبحث باللغة English

تأليف Kolyan Ray - Botond Szabo

المنهجية نظرية الإحصاء التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We study a mean-field spike and slab variational Bayes (VB) approximation to Bayesian model selection priors in sparse high-dimensional linear regression. Under compatibility conditions on the design matrix, oracle inequalities are derived for the mean-field VB approximation, implying that it converges to the sparse truth at the optimal rate and gives optimal prediction of the response vector. The empirical performance of our algorithm is studied, showing that it works comparably well as other state-of-the-art Bayesian variable selection methods. We also numerically demonstrate that the widely used coordinate-ascent variational inference (CAVI) algorithm can be highly sensitive to the parameter updating order, leading to potentially poor performance. To mitigate this, we propose a novel prioritized updating scheme that uses a data-driven updating order and performs better in simulations. The variational algorithm is implemented in the R package sparsevb.

قيم البحث

85 - Yue Yang , Ryan Martin 2020

In high-dimensions, the prior tails can have a significant effect on both posterior computation and asymptotic concentration rates. To achieve optimal rates while keeping the posterior computations relatively simple, an empirical Bayes approach has r ecently been proposed, featuring thin-tailed conjugate priors with data-driven centers. While conjugate priors ease some of the computational burden, Markov chain Monte Carlo methods are still needed, which can be expensive when dimension is high. In this paper, we develop a variational approximation to the empirical Bayes posterior that is fast to compute and retains the optimal concentration rate properties of the original. In simulations, our method is shown to have superior performance compared to existing variational approximations in the literature across a wide range of high-dimensional settings.

المنهجية نظرية الإحصاء التعلم الالي

Optimal Semi-supervised Estimation and Inference for High-dimensional Linear Regression

85 - Siyi Deng , Yang Ning , Jiwei Zhao 2020

There are many scenarios such as the electronic health records where the outcome is much more difficult to collect than the covariates. In this paper, we consider the linear regression problem with such a data structure under the high dimensionality. Our goal is to investigate when and how the unlabeled data can be exploited to improve the estimation and inference of the regression parameters in linear models, especially in light of the fact that such linear models may be misspecified in data analysis. In particular, we address the following two important questions. (1) Can we use the labeled data as well as the unlabeled data to construct a semi-supervised estimator such that its convergence rate is faster than the supervised estimators? (2) Can we construct confidence intervals or hypothesis tests that are guaranteed to be more efficient or powerful than the supervised estimators? To address the first question, we establish the minimax lower bound for parameter estimation in the semi-supervised setting. We show that the upper bound from the supervised estimators that only use the labeled data cannot attain this lower bound. We close this gap by proposing a new semi-supervised estimator which attains the lower bound. To address the second question, based on our proposed semi-supervised estimator, we propose two additional estimators for semi-supervised inference, the efficient estimator and the safe estimator. The former is fully efficient if the unknown conditional mean function is estimated consistently, but may not be more efficient than the supervised approach otherwise. The latter usually does not aim to provide fully efficient inference, but is guaranteed to be no worse than the supervised approach, no matter whether the linear model is correctly specified or the conditional mean function is consistently estimated.

المنهجية نظرية الإحصاء التعلم الالي

Variational Inference in high-dimensional linear regression

144 - Sumit Mukherjee , Subhabrata Sen 2021

We study high-dimensional Bayesian linear regression with product priors. Using the nascent theory of non-linear large deviations (Chatterjee and Dembo,2016), we derive sufficient conditions for the leading-order correctness of the naive mean-field a pproximation to the log-normalizing constant of the posterior distribution. Subsequently, assuming a true linear model for the observed data, we derive a limiting infinite dimensional variational formula for the log normalizing constant of the posterior. Furthermore, we establish that under an additional separation condition, the variational problem has a unique optimizer, and this optimizer governs the probabilistic properties of the posterior distribution. We provide intuitive sufficient conditions for the validity of this separation condition. Finally, we illustrate our results on concrete examples with specific design matrices.

نظرية الإحصاء الاحتمالات التعلم الالي

Bayesian high-dimensional linear regression with generic spike-and-slab priors

82 - Bai Jiang , Qiang Sun 2019

Spike-and-slab priors are popular Bayesian solutions for high-dimensional linear regression problems. Previous theoretical studies on spike-and-slab methods focus on specific prior formulations and use prior-dependent conditions and analyses, and thu s can not be generalized directly. In this paper, we propose a class of generic spike-and-slab priors and develop a unified framework to rigorously assess their theoretical properties. Technically, we provide general conditions under which generic spike-and-slab priors can achieve the nearly-optimal posterior contraction rate and the model selection consistency. Our results include those of Narisetty and He (2014) and Castillo et al. (2015) as special cases.

نظرية الإحصاء المنهجية التعلم الالي

Robust and sparse estimation methods for high dimensional linear and logistic regression

200 - Fatma Sevinc Kurnaz , Irene Hoffmann , Peter Filzmoser 2017

Fully robu

المنهجية

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة قاسيون الخاصة للعلوم والتكنولوجيا

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Variational Bayes for high-dimensional linear regression with sparse priors

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً