The divide-and-conquer method has been widely used to compute large-scale kernel ridge regression estimates. Unfortunately, when the response variable is highly skewed, divide-and-conquer kernel ridge regression (dacKRR) may overlook the underrepresented region of the response and yield unacceptable results. We develop a novel response-adaptive partition strategy to overcome this limitation. In particular, we propose allocating replicates of some carefully identified informative observations to multiple nodes (local processors). The idea is analogous to the popular oversampling technique. Although oversampling has been widely used to address discrete label skewness, extending it to the dacKRR setting is nontrivial. We provide both theoretical and practical guidance on how to effectively oversample the observations under the dacKRR setting. Furthermore, we show that the proposed estimator has a smaller asymptotic mean squared error (AMSE) than the classical dacKRR estimator under mild conditions. Our theoretical findings are supported by both simulated and real-data analyses.
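As a toy illustration of this divide-and-conquer-with-oversampling idea, the Python sketch below replicates upper-tail observations of a skewed response onto every node before fitting local KRR estimates and averaging them. The RBF kernel, the quantile rule for flagging informative points, and the replicate-to-all-nodes allocation are illustrative assumptions, not the authors' response-adaptive strategy.

import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of A and the rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def krr_fit(X, y, lam=1e-2, gamma=1.0):
    # Local kernel ridge regression: alpha = (K + n*lam*I)^{-1} y.
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + len(y) * lam * np.eye(len(y)), y)
    return X, alpha

def krr_predict(model, Xnew, gamma=1.0):
    Xtr, alpha = model
    return rbf_kernel(Xnew, Xtr, gamma) @ alpha

def dac_krr_oversampled(X, y, n_nodes=5, tail_q=0.95, rng=None):
    # Divide-and-conquer KRR with a naive oversampling step: observations in
    # the upper tail of the skewed response are replicated onto every node
    # before the random partition, so no node ignores that region entirely.
    rng = np.random.default_rng(rng)
    tail = y >= np.quantile(y, tail_q)            # assumed rule for "informative" points
    parts = rng.integers(0, n_nodes, size=len(y))
    models = [krr_fit(X[(parts == j) | tail], y[(parts == j) | tail])
              for j in range(n_nodes)]
    # The final estimate averages the local estimates, as in classical dacKRR.
    return lambda Xnew: np.mean([krr_predict(m, Xnew) for m in models], axis=0)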
The Nyström approximation is a randomized method that rapidly solves kernel ridge regression (KRR) problems by sub-sampling the n-by-n empirical kernel matrix appearing in the objective function. However, the performance of such a sub-sampling
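A minimal sketch of this Nyström scheme, assuming uniform sub-sampling of m landmark points and an RBF kernel; the regularization convention and parameter names below are assumptions rather than the paper's exact setup.

import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_krr(X, y, m=100, lam=1e-2, gamma=1.0, rng=None):
    # Nystrom KRR: sub-sample m landmark points, form the n-by-m and m-by-m
    # kernel blocks, and solve a reduced m-by-m system instead of the full
    # n-by-n one.
    rng = np.random.default_rng(rng)
    n = len(y)
    idx = rng.choice(n, size=min(m, n), replace=False)
    K_nm = rbf_kernel(X, X[idx], gamma)
    K_mm = rbf_kernel(X[idx], X[idx], gamma)
    # Coefficients beta solve (K_mn K_nm + n*lam*K_mm) beta = K_mn y.
    A = K_nm.T @ K_nm + n * lam * K_mm
    beta = np.linalg.solve(A + 1e-10 * np.eye(len(idx)), K_nm.T @ y)
    return lambda Xnew: rbf_kernel(Xnew, X[idx], gamma) @ beta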
The sparse factorization of a large matrix is fundamental in modern statistical learning. In particular, the sparse singular value decomposition and its variants have been utilized in multivariate regression, factor analysis, biclustering, vector tim
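For concreteness, the following is a generic rank-one sparse SVD sketch based on alternating power iterations with elementwise soft-thresholding; it is one common way to induce sparsity in the singular vectors and is not necessarily the factorization algorithm studied in this paper.

import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding operator.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_rank1_svd(M, lam_u=0.1, lam_v=0.1, n_iter=100):
    # Rank-one sparse SVD: alternate power iterations, soft-thresholding the
    # left and right singular vectors to zero out small loadings.
    u = np.linalg.svd(M, full_matrices=False)[0][:, 0]    # dense warm start
    for _ in range(n_iter):
        v = soft_threshold(M.T @ u, lam_v)
        if np.linalg.norm(v) > 0:
            v = v / np.linalg.norm(v)
        u = soft_threshold(M @ v, lam_u)
        if np.linalg.norm(u) > 0:
            u = u / np.linalg.norm(u)
    d = float(u @ M @ v)                                   # singular value estimate
    return u, d, v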
Building a sketch of an n-by-n empirical kernel matrix is a common approach to accelerating the computation of many kernel methods. In this paper, we propose a unified framework for constructing sketching methods in kernel ridge regression (KRR), which
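As one concrete instance of such a kernel-matrix sketch, the code below uses a Gaussian sketching matrix S and restricts the KRR dual weights to the sketched subspace; this is only an assumed illustrative special case (and the full kernel matrix is materialized here purely for clarity), not the unified framework itself.

import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def sketched_krr(X, y, m=100, lam=1e-2, gamma=1.0, rng=None):
    # Sketched KRR with an m-by-n Gaussian sketch S: the dual weights are
    # restricted to alpha = S^T w, so only an m-by-m system is solved.
    rng = np.random.default_rng(rng)
    n = len(y)
    K = rbf_kernel(X, X, gamma)
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    SK = S @ K
    A = SK @ K @ S.T + n * lam * (SK @ S.T)   # reduced m-by-m system
    w = np.linalg.solve(A, SK @ y)
    alpha = S.T @ w                           # back to n-dimensional dual weights
    return lambda Xnew: rbf_kernel(Xnew, X, gamma) @ alpha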
This paper carries out a large-dimensional analysis of a variation of kernel ridge regression that we call centered kernel ridge regression (CKRR), also known in the literature as kernel ridge regression with offset. This modified technique is
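A small sketch of one common formulation of centered KRR (KRR with offset): double-center the training kernel matrix, fit ridge weights on the centered responses, and carry the response mean as an explicit intercept. The centering and regularization conventions here are assumptions rather than the paper's definitions.

import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def centered_krr(X, y, lam=1e-2, gamma=1.0):
    # Centered KRR: Kc = H K H with H = I - (1/n) 11^T, fitted on y - mean(y).
    n = len(y)
    K = rbf_kernel(X, X, gamma)
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    alpha = np.linalg.solve(Kc + n * lam * np.eye(n), y - y.mean())
    y_bar, col_mean, grand_mean = y.mean(), K.mean(axis=0), K.mean()
    def predict(Xnew):
        k = rbf_kernel(Xnew, X, gamma)
        # Center the test kernel rows consistently with the training centering.
        kc = k - k.mean(axis=1, keepdims=True) - col_mean + grand_mean
        return y_bar + kc @ alpha
    return predict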
Understanding the generalization and estimation error of estimators for simple models such as linear and generalized linear models has attracted a lot of attention recently. This is in part due to an interesting observation made in the machine learning commu