Modern machine learning often operates in the regime where the number of parameters is much higher than the number of data points, achieving zero training loss and yet good generalization, thereby contradicting the classical bias-variance trade-off. This \textit{benign overfitting} phenomenon has recently been characterized using so-called \textit{double descent} curves, where the risk undergoes another descent (in addition to the classical U-shaped learning curve when the number of parameters is small) as the number of parameters increases beyond a certain threshold. In this paper, we examine the conditions under which \textit{benign overfitting} occurs in random feature (RF) models, i.e. in a two-layer neural network with fixed first-layer weights. We adopt a new view of random features and show that \textit{benign overfitting} arises due to the noise residing in such features (noise that may already be present in the data and propagate to the features, or that may be added by the user to the features directly), which plays an important implicit-regularization role in the phenomenon.
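To make the setup concrete, here is a minimal, hypothetical sketch of a random feature model in the overparameterized regime: a fixed (untrained) random ReLU first layer with a minimum-norm least-squares second layer. The dimensions, activation, and noise levels are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 200, 20, 2000          # n samples, d inputs, p >> n random features

# Synthetic regression data with label noise.
X = rng.standard_normal((n, d))
beta = rng.standard_normal(d) / np.sqrt(d)
y = X @ beta + 0.1 * rng.standard_normal(n)

# Fixed random first layer (never trained); ReLU random features.
W = rng.standard_normal((d, p)) / np.sqrt(d)
Phi = np.maximum(X @ W, 0.0)

# Only the second layer is fit: minimum-norm interpolating least squares.
a = np.linalg.pinv(Phi) @ y
print("train MSE:", np.mean((Phi @ a - y) ** 2))   # ~0: exact interpolation

# The abstract's claimed mechanism: i.i.d. noise added to the features acts
# like ridge regularization, since E[(Phi+E)^T (Phi+E)] = Phi^T Phi + n*sigma**2*I.
sigma = 0.5
Phi_noisy = Phi + sigma * rng.standard_normal((n, p))
a_noisy = np.linalg.lstsq(Phi_noisy, y, rcond=None)[0]
```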
We consider interpolation learning in high-dimensional linear regression with Gaussian data, and prove a generic uniform convergence guarantee on the generalization error of interpolators in an arbitrary hypothesis class in terms of the class's Gaussian width.
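For reference, the Gaussian width invoked here is the standard complexity measure from high-dimensional probability; a textbook definition in standard notation (not quoted from the paper itself):

```latex
% Gaussian width of a set K \subseteq \mathbb{R}^d: the expected supremum
% of the Gaussian process \langle g, v \rangle indexed by v \in K.
w(K) \;=\; \mathbb{E}_{g \sim \mathcal{N}(0, I_d)}\Big[\sup_{v \in K} \langle g, v \rangle\Big]
```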
Deep neural networks generalize well despite being exceedingly overparameterized and being trained without explicit regularization. This curious phenomenon has inspired extensive research activity in establishing its statistical principles: under what conditions, if any, does overfitting not harm generalization?
The growing literature on benign overfitting in overparameterized models has been mostly restricted to regression or binary classification settings; however, most success stories of modern machine learning have been recorded in multiclass settings. Motivated by this discrepancy, we study benign overfitting in multiclass linear classification.
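As a purely illustrative instance of the multiclass interpolation regime studied here, the sketch below fits a minimum-norm least-squares classifier to one-hot targets in an overparameterized Gaussian model with label noise; all dimensions and the noise rate are assumptions, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 150, 600, 3            # overparameterized: d >> n, k classes

# Gaussian features; labels from a planted linear model, plus label noise.
X = rng.standard_normal((n, d))
W_star = rng.standard_normal((d, k))
labels = (X @ W_star).argmax(1)
flip = rng.random(n) < 0.1
labels[flip] = rng.integers(0, k, flip.sum())

# Min-norm least-squares interpolation of one-hot targets, argmax decoding.
Y = np.eye(k)[labels]
W_hat = np.linalg.pinv(X) @ Y
print("train acc:", ((X @ W_hat).argmax(1) == labels).mean())   # 1.0: interpolates noisy labels

X_new = rng.standard_normal((2000, d))
y_new = (X_new @ W_star).argmax(1)
print("test acc :", ((X_new @ W_hat).argmax(1) == y_new).mean())  # benign if well above chance
```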
We introduce a new general identifiable framework for principled disentanglement referred to as Structured Nonlinear Independent Component Analysis (SNICA). Our contribution is to extend the identifiability theory of deep generative models for a very broad class of structured models.
We investigate under- and overfitting in Generative Adversarial Networks (GANs), using discriminators unseen by the generator to measure generalization. We find that the model capacity of the discriminator has a significant effect on the generator's model quality.
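The evaluation idea can be sketched in a few lines: train a fresh discriminator that the generator never interacted with, and use its held-out accuracy as a generalization score. The toy data, the logistic-regression critic, and all constants below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy "real" distribution and a deliberately imperfect generator
# (its samples are mean-shifted, i.e. it misses part of the data).
real = rng.standard_normal((2000, 2))
gen = rng.standard_normal((2000, 2)) + 0.75

X = np.vstack([gen, real])
y = np.r_[np.zeros(2000), np.ones(2000)]
idx = rng.permutation(len(X))
train, test = idx[:2000], idx[2000:]

# A fresh critic the generator has never been trained against.
critic = LogisticRegression(max_iter=1000).fit(X[train], y[train])

# Held-out accuracy near 0.5 would mean gen is indistinguishable from real;
# the excess over 0.5 quantifies the generator's failure to generalize.
print("unseen-critic accuracy:", critic.score(X[test], y[test]))
```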