
Ridge Regression Revisited: Debiasing, Thresholding and Bootstrap

Added by Yunyi Zhang
Publication date: 2020
Language: English





The success of the Lasso in the era of high-dimensional data can be attributed to its conducting an implicit model selection, i.e., zeroing out regression coefficients that are not significant. By contrast, classical ridge regression cannot reveal a potential sparsity of parameters and may also introduce a large bias under the high-dimensional setting. Nevertheless, recent work on the Lasso involves debiasing and thresholding, the latter in order to further enhance the model selection. As a consequence, ridge regression may be worth another look since, after debiasing and thresholding, it may offer some advantages over the Lasso, e.g., it can be easily computed using a closed-form expression. In this paper, we define a debiased and thresholded ridge regression method, and prove a consistency result and a Gaussian approximation theorem. We further introduce a wild bootstrap algorithm to construct confidence regions and perform hypothesis testing for a linear combination of parameters. In addition to estimation, we consider the problem of prediction, and present a novel, hybrid bootstrap algorithm tailored for prediction intervals. Extensive numerical simulations further show that the debiased and thresholded ridge regression has favorable finite-sample performance and may be preferable in some settings.
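To make the construction concrete, here is a minimal Python sketch of a debiased, hard-thresholded ridge estimator, assuming X is the design matrix, y the response, lam the ridge penalty, and tau the threshold level; the correction below is one standard form of debiasing and may differ in its details from the paper's exact construction.

import numpy as np

def debiased_thresholded_ridge(X, y, lam, tau):
    p = X.shape[1]
    G = X.T @ X + lam * np.eye(p)
    # Closed-form ridge fit: beta = (X'X + lam*I)^{-1} X'y
    beta_ridge = np.linalg.solve(G, X.T @ y)
    # Debias: since X'(y - X beta_ridge) = lam * beta_ridge, adding
    # G^{-1} X'(y - X beta_ridge) back gives (I + lam * G^{-1}) beta_ridge.
    beta_debiased = beta_ridge + lam * np.linalg.solve(G, beta_ridge)
    # Hard-threshold small coefficients to mimic the Lasso's model selection.
    return np.where(np.abs(beta_debiased) > tau, beta_debiased, 0.0)

Every step above is a direct linear-algebra computation, which is the closed-form advantage over the Lasso mentioned in the abstract.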



Related research


Wenjia Wang, Yi-Hui Zhou (2020)
In multivariate regression, also referred to as multi-task learning in machine learning, the goal is to recover a vector-valued function based on noisy observations. The vector-valued function is often assumed to be of low rank. Although multivariate linear regression has been extensively studied in the literature, a theoretical study of multivariate nonlinear regression is lacking. In this paper, we study reduced rank multivariate kernel ridge regression, proposed by Mukherjee and Zhu (2011). We prove the consistency of the function predictor and provide the convergence rate. An algorithm based on nuclear norm relaxation is proposed. A few numerical examples are presented to show the smaller mean squared prediction error compared with elementwise univariate kernel ridge regression.
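As an illustration, a minimal Python sketch of reduced-rank multivariate kernel ridge regression follows, assuming K is the (n, n) kernel Gram matrix of the training inputs and Y the (n, q) response matrix; the cited paper uses a nuclear norm relaxation, whereas this sketch enforces the rank by a hard SVD truncation, so treat it as an illustrative approximation.

import numpy as np

def reduced_rank_krr(K, Y, lam, rank):
    n = K.shape[0]
    # Full multivariate KRR: one shared ridge system for all q outputs.
    A = np.linalg.solve(K + lam * n * np.eye(n), Y)   # (n, q) dual coefficients
    # Enforce low rank by truncating the SVD of the fitted-value matrix.
    U, s, Vt = np.linalg.svd(K @ A, full_matrices=False)
    s[rank:] = 0.0
    fitted_low_rank = U @ (s[:, None] * Vt)
    return A, fitted_low_rank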
Wenjia Wang, Bing-Yi Jing (2021)
In this work, we investigate Gaussian process regression used to recover a function based on noisy observations. We derive upper and lower error bounds for Gaussian process regression with possibly misspecified correlation functions. The optimal convergence rate can be attained even if the smoothness of the imposed correlation function exceeds that of the true correlation function and the sampling scheme is quasi-uniform. As byproducts, we also obtain convergence rates of kernel ridge regression with a misspecified kernel function, where the underlying truth is a deterministic function. The convergence rates of Gaussian process regression and kernel ridge regression are closely connected, which is aligned with the relationship between the sample paths of a Gaussian process and the corresponding reproducing kernel Hilbert space.
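The posterior mean in this setting has a simple closed form, sketched below in Python with a Matérn-5/2 kernel standing in for a (possibly misspecified) working correlation function; the kernel choice and length scale are illustrative assumptions, not the paper's setup.

import numpy as np
from scipy.spatial.distance import cdist

def matern52(A, B, length_scale=0.2):
    # Matern-5/2 correlation: (1 + sqrt(5) r + 5 r^2 / 3) exp(-sqrt(5) r)
    d = cdist(A, B) / length_scale
    return (1 + np.sqrt(5) * d + 5 * d**2 / 3) * np.exp(-np.sqrt(5) * d)

def gp_posterior_mean(X_train, y_train, X_test, kernel, noise_var):
    # Posterior mean of GP regression; `kernel` may be misspecified.
    K = kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    return kernel(X_test, X_train) @ np.linalg.solve(K, y_train)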
It can be argued that optimal prediction should take into account all available data. Therefore, to evaluate a prediction interval's performance one should employ conditional coverage probability, conditioning on all available observations. Focusing on a linear model, we derive the asymptotic distribution of the difference between the conditional coverage probability of a nominal prediction interval and the conditional coverage probability of a prediction interval obtained via a residual-based bootstrap. Applying this result, we show that a prediction interval generated by the residual-based bootstrap has approximately 50% probability of yielding conditional under-coverage. We then develop a new bootstrap algorithm that generates a prediction interval that asymptotically controls both the conditional coverage probability and the possibility of conditional under-coverage. We complement the asymptotic results with several finite-sample simulations.
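For reference, the baseline residual-based bootstrap interval discussed above can be sketched in a few lines of Python; this is the interval shown to under-cover, not the paper's corrected algorithm, and X, y, x_new denote the design matrix, response, and new covariate vector.

import numpy as np

def residual_bootstrap_pi(X, y, x_new, alpha=0.1, B=2000, seed=None):
    rng = np.random.default_rng(seed)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    resid = resid - resid.mean()                  # center the residuals
    preds = np.empty(B)
    for b in range(B):
        # Refit on a bootstrap sample built from resampled residuals.
        y_b = X @ beta + rng.choice(resid, size=len(y), replace=True)
        beta_b = np.linalg.lstsq(X, y_b, rcond=None)[0]
        # Future response = point prediction + an independently resampled error.
        preds[b] = x_new @ beta_b + rng.choice(resid)
    return np.quantile(preds, [alpha / 2, 1 - alpha / 2])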
In this paper, we develop uniform inference methods for the conditional mode based on quantile regression. Specifically, we propose to estimate the conditional mode by minimizing the derivative of the estimated conditional quantile function defined by smoothing the linear quantile regression estimator, and develop two bootstrap methods, a novel pivotal bootstrap and the nonparametric bootstrap, for our conditional mode estimator. Building on high-dimensional Gaussian approximation techniques, we establish the validity of simultaneous confidence rectangles constructed from the two bootstrap methods for the conditional mode. We also extend the preceding analysis to the case where the dimension of the covariate vector is increasing with the sample size. Finally, we conduct simulation experiments and a real data analysis using U.S. wage data to demonstrate the finite sample performance of our inference method.
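A rough Python sketch of the mode estimator's core idea follows, using statsmodels' linear quantile regression on a grid of quantile levels; the paper smooths the quantile process before differentiating, whereas here a simple finite-difference gradient stands in for that smoothing, so this is only an illustrative approximation.

import numpy as np
import statsmodels.api as sm

def conditional_mode(X, y, x0, taus=np.linspace(0.1, 0.9, 33)):
    Xc = sm.add_constant(X)
    x0c = np.concatenate(([1.0], np.atleast_1d(x0)))
    # Estimated conditional quantile function: tau -> q(tau | x0).
    q = np.array([sm.QuantReg(y, Xc).fit(q=t).params @ x0c for t in taus])
    # The conditional density equals 1 / q'(tau), so the mode sits where
    # the derivative of the quantile function is smallest.
    dq = np.gradient(q, taus)
    return q[np.argmin(dq)]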
Lee H. Dicker (2016)
We study asymptotic minimax problems for estimating a $d$-dimensional regression parameter over spheres of growing dimension ($d \to \infty$). Assuming that the data follow a linear model with Gaussian predictors and errors, we show that ridge regression is asymptotically minimax and derive new closed-form expressions for its asymptotic risk under squared-error loss. The asymptotic risk of ridge regression is closely related to the Stieltjes transform of the Marčenko-Pastur distribution and the spectral distribution of the predictors from the linear model. Adaptive ridge estimators are also proposed (which adapt to the unknown radius of the sphere) and connections with equivariant estimation are highlighted. Our results are mostly relevant for asymptotic settings where the number of observations, $n$, is proportional to the number of predictors, that is, $d/n \to \rho \in (0, \infty)$.
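In the proportional regime $d/n \to \rho$, the ridge risk can also be checked by direct simulation; below is a small Monte Carlo sketch under Gaussian predictors and errors (the penalty level and signal normalization are illustrative choices, not the paper's closed-form risk expressions).

import numpy as np

def ridge_risk_mc(n=400, rho=0.5, lam=0.5, sigma=1.0, reps=50, seed=0):
    rng = np.random.default_rng(seed)
    d = int(rho * n)
    risks = []
    for _ in range(reps):
        beta = rng.normal(size=d) / np.sqrt(d)    # signal of roughly unit norm
        X = rng.normal(size=(n, d))
        y = X @ beta + sigma * rng.normal(size=n)
        bhat = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)
        risks.append(np.sum((bhat - beta) ** 2))
    return np.mean(risks)                          # estimated squared-error risk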