Interpolators -- estimators that achieve zero training error -- have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type. In this paper, we study minimum $\ell_2$ norm (``ridgeless'') interpolation in high-dimensional least squares regression. We consider two different models for the feature distribution: a linear model, where the feature vectors $x_i \in {\mathbb R}^p$ are obtained by applying a linear transform to a vector of i.i.d. entries, $x_i = \Sigma^{1/2} z_i$ (with $z_i \in {\mathbb R}^p$); and a nonlinear model, where the feature vectors are obtained by passing the input through a random one-layer neural network, $x_i = \varphi(W z_i)$ (with $z_i \in {\mathbb R}^d$, $W \in {\mathbb R}^{p \times d}$ a matrix of i.i.d. entries, and $\varphi$ an activation function acting componentwise on $W z_i$). We recover -- in a precise quantitative way -- several phenomena that have been observed in large-scale neural networks and kernel machines, including the double descent behavior of the prediction risk, and the potential benefits of overparametrization.
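The minimum $\ell_2$ norm interpolator described above has a closed form via the Moore-Penrose pseudoinverse. A minimal numpy sketch (dimensions, noise level, and the Gaussian design are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200  # overparametrized regime: p > n, so many interpolators exist
X = rng.standard_normal((n, p))
beta_star = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_star + 0.1 * rng.standard_normal(n)

# Minimum-l2-norm ("ridgeless") interpolator: beta_hat = X^+ y,
# the least-squares solution of smallest Euclidean norm.
beta_hat = np.linalg.pinv(X) @ y

# It interpolates: the training residual is zero up to numerical precision.
train_resid = float(np.max(np.abs(X @ beta_hat - y)))
print(train_resid)
```

Equivalently, `beta_hat` is the limit of the ridge estimator $(X^\top X + \lambda I)^{-1} X^\top y$ as $\lambda \to 0^+$, which is why "ridgeless" is an apt name.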
We study the problem of exact support recovery based on noisy observations and present Refined Least Squares (RLS). Given a set of noisy measurements $$\myvec{y} = \myvec{X}\myvec{\theta}^* + \myvec{\omega},$$ and $\myvec{X} \in \mathbb{R}^{N \times D}$ which
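The measurement model above can be made concrete as follows. Since the description of RLS itself is cut off here, this sketch only sets up the noisy linear model with a sparse $\myvec{\theta}^*$ and checks exact support recovery for a naive baseline (ordinary least squares plus magnitude thresholding); the dimensions, sparsity level, and noise scale are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
N, D, k = 100, 50, 5  # N measurements, D-dimensional signal, k-sparse support

# Hypothetical sparse ground truth theta* supported on k coordinates
theta_star = np.zeros(D)
support = rng.choice(D, size=k, replace=False)
theta_star[support] = 2.0 * np.sign(rng.standard_normal(k))

X = rng.standard_normal((N, D))
omega = 0.1 * rng.standard_normal(N)
y = X @ theta_star + omega  # noisy measurements y = X theta* + omega

# Baseline (NOT the paper's RLS): least squares, then keep the k largest
# coefficients in magnitude and compare against the true support.
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
est_support = set(np.argsort(np.abs(theta_ls))[-k:].tolist())
exact_recovery = est_support == set(support.tolist())
print(exact_recovery)
```

Exact support recovery means the estimated support equals the true support as a set, which is a stricter requirement than a small estimation error in $\myvec{\theta}$.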
We show that the high-dimensional behavior of symmetrically penalized least squares with a possibly non-separable, symmetric, convex penalty in both (i) the Gaussian sequence model and (ii) the linear model with uncorrelated Gaussian designs nearly m
In model selection, several types of cross-validation are commonly used and many variants have been introduced. While consistency of some of these methods has been proven, their rate of convergence to the oracle is generally still unknown. Until now,
We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions. We formulate the value function estimation procedure in value iteration as an approximate maximum inner pro
Penalization procedures often suffer from their dependence on multiplying factors, whose optimal values are either unknown or hard to estimate from the data. We propose a completely data-driven calibration algorithm for this parameter in the least-sq