We consider the variation space corresponding to a dictionary of functions in $L^2(\Omega)$ and present the basic theory of approximation in these spaces. Specifically, we compare the definition based on integral representations with the definition in terms of convex hulls. We show that in many cases, including the dictionaries corresponding to shallow ReLU$^k$ networks and a dictionary of decaying Fourier modes, the two definitions coincide. We also give a partial characterization of the variation space for shallow ReLU$^k$ networks and show that the variation space with respect to the dictionary of decaying Fourier modes corresponds to the Barron spectral space.
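For concreteness, a sketch of the two standard definitions being compared (the notation $\mathcal{K}_1(\mathbb{D})$ and the precise conditions on the dictionary $\mathbb{D} \subset L^2(\Omega)$ are assumptions here, not taken from the abstract): the convex-hull variation norm is
\[
\|f\|_{\mathcal{K}_1(\mathbb{D})} \;=\; \inf\bigl\{\, t > 0 \,:\, f \in t\,\overline{\mathrm{conv}}(\mathbb{D} \cup -\mathbb{D}) \,\bigr\},
\]
with the closure taken in $L^2(\Omega)$, while the integral-representation norm is the infimum of the total variation $\|\mu\|$ over all signed (Radon) measures $\mu$ on $\mathbb{D}$ satisfying $f = \int_{\mathbb{D}} d \,\mathrm{d}\mu(d)$.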
We consider the approximation rates of shallow neural networks with respect to the variation norm. Upper bounds on these rates have been established for sigmoidal and ReLU activation functions, but it has remained an important open problem whether th
We consider the teacher-student setting of learning shallow neural networks with quadratic activations and planted weight matrix $W^* \in \mathbb{R}^{m \times d}$, where $m$ is the width of the hidden layer and $d \le m$ is the data dimension. We study the o
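A minimal sketch of this teacher-student setup, assuming the common label model in which the output is the sum of squared pre-activations, $y = \|W^* x\|_2^2$, and the student is trained by plain gradient descent on the squared loss (the paper's exact loss, scaling, and initialization may differ):

```python
import numpy as np

# Sketch of a quadratic-activation teacher-student setup (assumed label model
# y = sum_j (w_j^{*T} x)^2 = ||W* x||^2; not necessarily the paper's exact setup).
rng = np.random.default_rng(0)
d, m, n = 10, 10, 2000                     # data dimension, hidden width, samples
W_star = rng.standard_normal((m, d))       # planted teacher weights W*
X = rng.standard_normal((n, d))
y = np.sum((X @ W_star.T) ** 2, axis=1)    # teacher labels

W = 0.1 * rng.standard_normal((m, d))      # student initialization
lr = 1e-4
for _ in range(1000):
    A = X @ W.T                            # student pre-activations, shape (n, m)
    resid = np.sum(A ** 2, axis=1) - y     # student output minus label
    grad = 2.0 * (resid[:, None] * A).T @ X / n   # gradient of the mean squared loss / 2
    W -= lr * grad
```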
We consider shallow (single hidden layer) neural networks and characterize their performance when trained with stochastic gradient descent as the number of hidden units $N$ and gradient descent steps grow to infinity. In particular, we investigate th
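As an illustration, a minimal sketch of online SGD for a single-hidden-layer network in the mean-field scaling $f(x) = \frac{1}{N}\sum_{i=1}^N c_i\,\sigma(w_i \cdot x)$ (the target function, ReLU activation, and step size below are placeholder assumptions, not the paper's setup):

```python
import numpy as np

# Online (one-pass) SGD for a shallow network in the 1/N (mean-field) scaling.
# Target, activation, and hyperparameters are placeholders for illustration.
rng = np.random.default_rng(1)
d, N, steps, lr = 5, 1000, 20000, 0.5
w = rng.standard_normal((N, d))            # hidden-layer weights
c = rng.standard_normal(N)                 # output weights

def target(x):                             # placeholder target function
    return max(x[0], 0.0)

for _ in range(steps):
    x = rng.standard_normal(d)             # fresh sample each step
    z = w @ x                              # pre-activations, shape (N,)
    h = np.maximum(z, 0.0)                 # ReLU activations
    err = c @ h / N - target(x)            # prediction error
    grad_c = err * h / N                   # gradient of (1/2) err^2 w.r.t. c
    grad_w = err * np.outer(c * (z > 0), x) / N   # gradient w.r.t. w
    c -= lr * grad_c
    w -= lr * grad_w
```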
Dropout is a regularisation technique in neural network training where unit activations are randomly set to zero with a given probability \emph{independently}. In this work, we propose a generalisation of dropout and other multiplicative noise injecti
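For reference, a minimal sketch showing standard dropout as one instance of multiplicative noise injection (the Gaussian variant and its variance are illustrative assumptions; the specific generalisation proposed in the paper is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)

def multiplicative_noise(h, p=0.5, kind="bernoulli"):
    """Multiply activations h by i.i.d. mean-one noise; 'bernoulli' is classic dropout."""
    if kind == "bernoulli":
        mask = rng.binomial(1, 1.0 - p, size=h.shape) / (1.0 - p)       # inverted dropout
    else:
        mask = rng.normal(1.0, np.sqrt(p / (1.0 - p)), size=h.shape)    # Gaussian noise
    return h * mask

h = np.maximum(rng.standard_normal(8), 0.0)   # example hidden activations
print(multiplicative_noise(h, kind="bernoulli"))
print(multiplicative_noise(h, kind="gaussian"))
```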
Neural networks have excelled at regression and classification problems when the input space consists of scalar variables. As a result of this proficiency, several popular packages have been developed that allow users to easily fit these kinds of mod