We study gradient-based regularization methods for neural networks, focusing on two such methods: total variation regularization and Tikhonov regularization. Applying these methods is equivalent to using neural networks to solve certain partial differential equations, which in practical applications are often posed in high dimensions. In this work, we introduce a general framework for analyzing the generalization error of regularized networks. The error estimate relies on two assumptions, one on the approximation error and one on the quadrature error. Moreover, we conduct experiments on image classification tasks showing that gradient-based methods can significantly improve the generalization ability and adversarial robustness of neural networks. A graphical extension of the gradient-based methods is also considered in the experiments.
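To make the two regularizers concrete, the following is a minimal JAX sketch of a regularized loss, assuming the penalty acts on the input gradient of the network and the integral is estimated by Monte Carlo quadrature over sample points (which is where a quadrature-error assumption enters). The network architecture, the weight lam, and the sampling scheme are illustrative assumptions, not the paper's exact setup.

```python
import jax
import jax.numpy as jnp

def mlp(params, x):
    # Two-layer network f_theta: R^d -> R with a tanh hidden layer.
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    return (h @ w2 + b2).squeeze()

def gradient_regularizer(params, xs, p):
    # Monte Carlo quadrature of the gradient penalty over the points xs:
    # p = 1 approximates the total variation  E[ |grad_x f(x)| ],
    # p = 2 approximates the Tikhonov term    E[ |grad_x f(x)|^2 ].
    grads = jax.vmap(jax.grad(mlp, argnums=1), in_axes=(None, 0))(params, xs)
    return jnp.mean(jnp.linalg.norm(grads, axis=-1) ** p)

def loss(params, xs, ys, lam, p):
    # Empirical risk plus the gradient-based regularization term.
    preds = jax.vmap(mlp, in_axes=(None, 0))(params, xs)
    return jnp.mean((preds - ys) ** 2) + lam * gradient_regularizer(params, xs, p)

# Illustrative usage with random data (hypothetical sizes).
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
d, width, n = 4, 32, 128
params = (jax.random.normal(k1, (d, width)) / jnp.sqrt(d), jnp.zeros(width),
          jax.random.normal(k2, (width, 1)) / jnp.sqrt(width), jnp.zeros(1))
xs = jax.random.normal(k3, (n, d))
ys = jnp.sin(xs).sum(axis=-1)
value, grads = jax.value_and_grad(loss)(params, xs, ys, 1e-3, 2)
```

The exponent p switches between the total variation (p = 1) and Tikhonov (p = 2) penalties; in a PDE-style treatment the penalty integral would be discretized over quadrature points rather than reusing the training samples, as done here for brevity.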