In this paper, we study early stopping for iterative learning algorithms in a reproducing kernel Hilbert space (RKHS) within the nonparametric regression framework. In particular, we work with gradient descent and (iterative) kernel ridge regression. We present a data-driven early stopping rule that requires no validation set and is based on the so-called minimum discrepancy principle. The method relies on a single assumption about the regression function: it belongs to an RKHS. The proposed rule is proved to be minimax-optimal over different types of kernel spaces, including finite-rank and Sobolev smoothness classes. The proof is derived from a fixed-point analysis of localized Rademacher complexities, a standard technique for obtaining optimal rates in the nonparametric regression literature. In addition, we present simulation results on artificial datasets showing that the proposed rule performs comparably to other stopping rules, such as the one determined by V-fold cross-validation.
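To make the idea concrete, the following is a minimal sketch of kernel gradient descent stopped by a classical form of the discrepancy principle: iterate until the empirical residual first drops to the noise level. It is an illustration under stated assumptions, not necessarily the exact rule analyzed in the paper. In particular, the noise variance `sigma2` is assumed to be known, and the Gaussian kernel, its bandwidth, the step size, and the function names are all hypothetical choices made for the example.

```python
import numpy as np


def gaussian_kernel(x, z, bandwidth=0.2):
    """Gaussian kernel matrix between 1-D inputs (hypothetical choice of kernel)."""
    d2 = (x[:, None] - z[None, :]) ** 2
    return np.exp(-d2 / (2.0 * bandwidth ** 2))


def kernel_gd_discrepancy_stop(K, y, sigma2, max_iter=5000):
    """Kernel gradient descent on the fitted values, stopped by a discrepancy rule.

    Assumption: the noise variance sigma2 is known. The iteration stops at the
    first t whose empirical residual mean((y - f_t)^2) is at most sigma2.
    Returns (t, f_t); if the rule does not fire within the first max_iter
    iterates, returns (max_iter, f_max_iter).
    """
    n = len(y)
    Kn = K / n  # normalized kernel matrix
    # Step size 1 / (largest eigenvalue of Kn) keeps the residual non-increasing.
    step = 1.0 / np.linalg.eigvalsh(Kn)[-1]
    f = np.zeros(n)  # start from the zero function
    for t in range(max_iter):
        residual = np.mean((y - f) ** 2)
        if residual <= sigma2:
            return t, f
        # Gradient step on the least-squares risk, expressed on fitted values.
        f = f + step * (Kn @ (y - f))
    return max_iter, f


if __name__ == "__main__":
    # Synthetic data: a hypothetical regression function plus Gaussian noise.
    rng = np.random.default_rng(0)
    n, sigma = 100, 0.5
    x = np.sort(rng.uniform(0.0, 1.0, size=n))
    y = np.sin(2.0 * np.pi * x) + sigma * rng.normal(size=n)
    K = gaussian_kernel(x, x)
    t_stop, f_hat = kernel_gd_discrepancy_stop(K, y, sigma2=sigma ** 2)
    print("stopping iteration:", t_stop)
    print("empirical residual:", np.mean((y - f_hat) ** 2))
```

The rule uses only the training residual, which is why no validation set is needed. In practice the noise variance would have to be estimated, and that step is outside this sketch.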
We show that unconverged stochastic gradient descent can be interpreted as a procedure that samples from a nonparametric variational approximate posterior distribution. This distribution is implicitly defined as the transformation of an initial distr
We show that minimum-norm interpolation in the Reproducing Kernel Hilbert Space corresponding to the Laplace kernel is not consistent if input dimension is constant. The lower bound holds for any choice of kernel bandwidth, even if selected based on
This paper studies binary logistic regression for rare events data, or imbalanced data, where the number of events (observations in one class, often called cases) is significantly smaller than the number of nonevents (observations in the other class,
In this paper, we study the implicit bias of gradient descent for sparse regression. We extend results on regression with quadratic parametrization, which amounts to depth-2 diagonal linear networks, to more general depth-N networks, under more reali
Systems of interacting particles or agents have wide applications in many disciplines such as Physics, Chemistry, Biology and Economics. These systems are governed by interaction laws, which are often unknown: estimating them from observation data is