ﻻ يوجد ملخص باللغة العربية
In this paper, we construct neural networks with ReLU, sine and $2^x$ as activation functions. For general continuous $f$ defined on $[0,1]^d$ with continuity modulus $omega_f(cdot)$, we construct ReLU-sine-$2^x$ networks that enjoy an approximation rate $mathcal{O}(omega_f(sqrt{d})cdot2^{-M}+omega_{f}left(frac{sqrt{d}}{N}right))$, where $M,Nin mathbb{N}^{+}$ denote the hyperparameters related to widths of the networks. As a consequence, we can construct ReLU-sine-$2^x$ network with the depth $5$ and width $maxleft{leftlceil2d^{3/2}left(frac{3mu}{epsilon}right)^{1/{alpha}}rightrceil,2leftlceillog_2frac{3mu d^{alpha/2}}{2epsilon}rightrceil+2right}$ that approximates $fin mathcal{H}_{mu}^{alpha}([0,1]^d)$ within a given tolerance $epsilon >0$ measured in $L^p$ norm $pin[1,infty)$, where $mathcal{H}_{mu}^{alpha}([0,1]^d)$ denotes the Holder continuous function class defined on $[0,1]^d$ with order $alpha in (0,1]$ and constant $mu > 0$. Therefore, the ReLU-sine-$2^x$ networks overcome the curse of dimensionality on $mathcal{H}_{mu}^{alpha}([0,1]^d)$. In addition to its supper expressive power, functions implemented by ReLU-sine-$2^x$ networks are (generalized) differentiable, enabling us to apply SGD to train.
We investigate the parameter-space geometry of recurrent neural networks (RNNs), and develop an adaptation of path-SGD optimization method, attuned to this geometry, that can learn plain RNNs with ReLU activations. On several datasets that require ca
We study the expressive power of deep ReLU neural networks for approximating functions in dilated shift-invariant spaces, which are widely used in signal processing, image processing, communications and so on. Approximation error bounds are estimated
Neural networks have been widely used to solve complex real-world problems. Due to the complicate, nonlinear, non-convex nature of neural networks, formal safety guarantees for the output behaviors of neural networks will be crucial for their applica
We present polynomial time and sample efficient algorithms for learning an unknown depth-2 feedforward neural network with general ReLU activations, under mild non-degeneracy assumptions. In particular, we consider learning an unknown network of the
The number of linear regions is one of the distinct properties of the neural networks using piecewise linear activation functions such as ReLU, comparing with those conventional ones using other activation functions. Previous studies showed this prop