An Upper Limit of Decaying Rate with Respect to Frequency in Deep Neural Network

Published by Zhiwei Wang

Publication date 2021

Research field Informatics Engineering

Language English

Authors Tao Luo - Zheng Ma - Zhiwei Wang

A deep neural network (DNN) usually learns a target function from low to high frequency, a behavior known as the frequency principle or spectral bias. The frequency principle sheds light on a high-frequency curse of DNNs: high-frequency information is difficult to learn. Inspired by the frequency principle, a series of works has been devoted to developing algorithms that overcome the high-frequency curse. A natural question arises: what is the upper limit of the decaying rate with respect to frequency when one trains a DNN? In this work, our theory, confirmed by numerical experiments, suggests that there is a critical decaying rate with respect to frequency in DNN training. Below this upper limit of the decaying rate, the DNN interpolates the training data by a function with a certain regularity. Above the upper limit, however, the DNN interpolates the training data by a trivial function, i.e., a function that is non-zero only at the training data points. Our results indicate that a better way to overcome the high-frequency curse is to design a proper preconditioning approach that shifts high-frequency information to low frequency, which coincides with several previously developed algorithms for fast learning of high-frequency information. More importantly, this work rigorously proves that the high-frequency curse is an intrinsic difficulty of DNNs.
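The idea of shifting high-frequency information to low frequency can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm, only a toy demonstration of a frequency shift: the target signal, the carrier frequency (assumed known in advance), and the helper `dominant_frequency` are hypothetical choices made for illustration.

```python
import numpy as np


def dominant_frequency(signal, sample_spacing):
    """Return the absolute frequency (cycles per unit) with the largest FFT magnitude."""
    spectrum = np.fft.fft(signal)
    freqs = np.fft.fftfreq(signal.size, d=sample_spacing)
    return abs(freqs[np.argmax(np.abs(spectrum))])


# Hypothetical target: a slowly varying envelope riding on a fast carrier.
n_samples = 1024
x = np.arange(n_samples) / n_samples           # uniform periodic grid on [0, 1)
carrier_freq = 40.0                            # assumption: known ahead of time
envelope = 1.0 + 0.5 * np.cos(2 * np.pi * 2.0 * x)
target = envelope * np.exp(2j * np.pi * carrier_freq * x)

# Frequency shift: multiplying by exp(-2*pi*i*k0*x) moves the whole
# spectrum down by k0, so only the low-frequency envelope remains.
shifted = target * np.exp(-2j * np.pi * carrier_freq * x)

original_peak = dominant_frequency(target, 1.0 / n_samples)
shifted_peak = dominant_frequency(shifted, 1.0 / n_samples)
print(original_peak, shifted_peak)
```

In this toy setting the dominant frequency moves from the carrier frequency down to zero, so a model fitted to `shifted` would only need to capture the slowly varying envelope; the original signal is recovered by multiplying the fitted function by the carrier again.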

Yunfei Yang, Zhen Li, Yang Wang 2020

We study the expressive power of deep ReLU neural networks for approximating functions in dilated shift-invariant spaces, which are widely used in signal processing, image processing, communications and so on. Approximation error bounds are estimated with respect to the width and depth of neural networks. The network construction is based on the bit extraction and data-fitting capacity of deep neural networks. As applications of our main results, the approximation rates of classical function spaces such as Sobolev spaces and Besov spaces are obtained. We also give lower bounds of the $L^p$ ($1 \le p \le \infty$) approximation error for Sobolev spaces, which show that our construction of neural networks is asymptotically optimal up to a logarithmic factor.

Machine learning, Numerical analysis

IDRLnet: A Physics-Informed Neural Network Library

Wei Peng, Jun Zhang, Weien Zhou 2021

Physics Informed Neural Network (PINN) is a scientific computing framework used to solve both forward and inverse problems modeled by Partial Differential Equations (PDEs). This paper introduces IDRLnet, a Python toolbox for modeling and solving problems through PINN systematically. IDRLnet constructs the framework for a wide range of PINN algorithms and applications. It provides a structured way to incorporate geometric objects, data sources, artificial neural networks, loss metrics, and optimizers within Python. Furthermore, it provides functionality to solve noisy inverse problems, variational minimization, and integral differential equations. New PINN variants can be integrated into the framework easily. Source code, tutorials, and documentation are available at https://github.com/idrl-lab/idrlnet.

Machine learning, Numerical analysis

Batch Normalization Preconditioning for Neural Network Training

Susanna Lange, Kyle Helfrich, Qiang Ye 2021

Batch normalization (BN) is a popular and ubiquitous method in deep learning that has been shown to decrease training time and improve generalization performance of neural networks. Despite its success, BN is not theoretically well understood. It is not suitable for use with very small mini-batch sizes or online learning. In this paper, we propose a new method called Batch Normalization Preconditioning (BNP). Instead of applying normalization explicitly through a batch normalization layer as is done in BN, BNP applies normalization by conditioning the parameter gradients directly during training. This is designed to improve the conditioning of the Hessian matrix of the loss function and hence convergence during training. One benefit is that BNP is not constrained by the mini-batch size and works in the online learning setting. Furthermore, its connection to BN provides theoretical insights into how BN improves training and how BN is applied to special architectures such as convolutional neural networks.

Machine learning, Numerical analysis

A monotone scheme for G-equations with application to the explicit convergence rate of robust central limit theorem

Shuo Huang, Gechun Liang 2019

We propose a monotone approximation scheme for a class of fully nonlinear PDEs called G-equations. Such equations arise often in the characterization of G-distributed random variables in a sublinear expectation space. The proposed scheme is constructed recursively based on a piecewise constant approximation of the viscosity solution to the G-equation. We establish the convergence of the scheme and determine the convergence rate with an explicit error bound, using the comparison principles for both the scheme and the equation together with a mollification procedure. The first application is obtaining the convergence rate of Peng's robust central limit theorem with an explicit bound of Berry-Esseen type. The second application is an approximation scheme with its convergence rate for the Black-Scholes-Barenblatt equation.

Probability, Numerical analysis

Physics-Enforced Modeling for Insertion Loss of Transmission Lines by Deep Neural Networks

Liang Chen, Lesley Tan 2021

In this paper, we investigate data-driven parameterized modeling of insertion loss for transmission lines with respect to design parameters. We first show that direct application of neural networks can lead to non-physical models with negative insertion loss. To mitigate this problem, we propose two deep learning solutions. One solution is to add a regulation term, which represents the passive condition, to the final loss function to enforce the negative quantity of insertion loss. In the second method, a third-order polynomial expression, which ensures positiveness, is first defined to approximate the insertion loss; the DeepONet neural network structure, which was proposed recently for function and system modeling, is then employed to model the coefficients of the polynomial. The resulting neural network is applied to predict the coefficients of the polynomial expression. The experimental results on an open-sourced SI/PI database of a PCB design show that both methods can ensure the positiveness of the insertion loss. Furthermore, both methods achieve similar prediction results, while the polynomial-based DeepONet method is faster than the DeepONet-based method in training time.

Machine learning, Numerical analysis