In this work, motivated by recent developments in deep-neural-network solvers for partial differential equations, we establish approximation results for smooth functions by deep neural networks, with errors measured in Sobolev norms. The error bounds are characterized explicitly in terms of both the width and the depth of the networks simultaneously. Namely, for $f \in C^s([0,1]^d)$, we show that deep ReLU networks of width $\mathcal{O}(N\log N)$ and depth $\mathcal{O}(L\log L)$ achieve a non-asymptotic approximation rate of $\mathcal{O}(N^{-2(s-1)/d}L^{-2(s-1)/d})$ with respect to the $\mathcal{W}^{1,p}([0,1]^d)$ norm for $p \in [1,\infty)$. If either the ReLU function or its square is used as the activation function in deep neural networks of width $\mathcal{O}(N\log N)$ and depth $\mathcal{O}(L\log L)$ approximating $f \in C^s([0,1]^d)$, the non-asymptotic approximation rate is $\mathcal{O}(N^{-2(s-n)/d}L^{-2(s-n)/d})$ with respect to the $\mathcal{W}^{n,p}([0,1]^d)$ norm for $p \in [1,\infty)$.
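To make the second rate concrete (the parameter values below are chosen here only for illustration and are not taken from the abstract): for $f \in C^3([0,1]^4)$ approximated in the $\mathcal{W}^{1,p}$ norm, the exponent is $2(s-n)/d = 2(3-1)/4 = 1$, so

$$\mathcal{O}\bigl(N^{-2(s-n)/d}L^{-2(s-n)/d}\bigr) = \mathcal{O}\bigl(N^{-1}L^{-1}\bigr),$$

i.e., the error decays linearly in both the width parameter $N$ and the depth parameter $L$.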
Artificial neural networks (ANNs) have become a powerful tool for approximating high-dimensional functions. In particular, deep ANNs, consisting of a large number of hidden layers, have been used with great success in a range of practical …
We prove that a variant of the classical Sobolev space of first-order dominating mixed smoothness is equivalent (under a certain condition) to the unanchored ANOVA space on $\mathbb{R}^d$, for $d \geq 1$. Both spaces are Hilbert spaces involving weight …
Neural networks (NNs) are the method of choice for building learning algorithms. Their popularity stems from their empirical success on several challenging learning problems. However, most scholars agree that a convincing theoretical explanation for …
In this paper, we develop a new neural network family based on power series expansion, which is proved to achieve better approximation accuracy than existing neural networks. This new set of neural networks embeds the power series expansion …
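Since the abstract above is cut off, the following is only a hypothetical sketch of the general idea, not the paper's construction: inputs are lifted to truncated power-series (monomial) features before a standard affine-plus-ReLU layer. The function names, the truncation order K, and the layer shapes are all assumptions introduced here for illustration.

    # Hypothetical sketch: power-series feature lift followed by one ReLU layer.
    # This is an illustrative guess, not the architecture from the abstract.
    import numpy as np

    def power_series_features(x, K):
        """Stack monomials x^1, ..., x^K along the feature axis.

        x : (batch, d) input array; K : truncation order of the expansion.
        Returns an array of shape (batch, d*K).
        """
        return np.concatenate([x**k for k in range(1, K + 1)], axis=1)

    def relu_layer(z, W, b):
        """One affine map followed by the ReLU activation."""
        return np.maximum(z @ W.T + b, 0.0)

    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, size=(4, 2))   # 4 sample points in [0,1]^2
    z = power_series_features(x, K=3)        # shape (4, 6): [x, x^2, x^3]
    W = rng.normal(size=(8, 6))              # hidden width 8 (arbitrary choice)
    b = rng.normal(size=8)
    print(relu_layer(z, W, b).shape)         # (4, 8)

The design intent of such a lift is that low-degree polynomial structure in the target function can be represented directly by the feature map rather than emulated by many ReLU layers; whether the paper realizes the expansion this way cannot be determined from the truncated abstract.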
Deep learning is a powerful tool for solving nonlinear differential equations, but usually only the solution corresponding to the flattest local minimizer can be found, due to the implicit regularization of stochastic gradient descent. This paper …