Two-Layer Neural Networks for Partial Differential Equations: Optimization and Generalization Theory

78 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Haizhao Yang

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Tao Luo - Haizhao Yang

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The problem of solving partial differential equations (PDEs) can be formulated into a least-squares minimization problem, where neural networks are used to parametrize PDE solutions. A global minimizer corresponds to a neural network that solves the given PDE. In this paper, we show that the gradient descent method can identify a global minimizer of the least-squares optimization for solving second-order linear PDEs with two-layer neural networks under the assumption of over-parametrization. We also analyze the generalization error of the least-squares optimization for second-order linear PDEs and two-layer neural networks, when the right-hand-side function of the PDE is in a Barron-type space and the least-squares optimization is regularized with a Barron-type norm, without the over-parametrization assumption.

قيم البحث

88 - Suprosanna Shit , Ivan Ezhov , Leon Machler 2021

Fast and accurate solutions of time-dependent partial differential equations (PDEs) are of pivotal interest to many research fields, including physics, engineering, and biology. Generally, implicit/semi-implicit schemes are preferred over explicit on es to improve stability and correctness. However, existing semi-implicit methods are usually iterative and employ a general-purpose solver, which may be sub-optimal for a specific class of PDEs. In this paper, we propose a neural solver to learn an optimal iterative scheme in a data-driven fashion for any class of PDEs. Specifically, we modify a single iteration of a semi-implicit solver using a deep neural network. We provide theoretical guarantees for the correctness and convergence of neural solvers analogous to conventional iterative solvers. In addition to the commonly used Dirichlet boundary condition, we adopt a diffuse domain approach to incorporate a diverse type of boundary conditions, e.g., Neumann. We show that the proposed neural solver can go beyond linear PDEs and applies to a class of non-linear PDEs, where the non-linear component is non-stiff. We demonstrate the efficacy of our method on 2D and 3D scenarios. To this end, we show how our model generalizes to parameter settings, which are different from training; and achieves faster convergence than semi-implicit schemes.

التحليل العددي التعلم الآلي التحليل العددي

Weak SINDy For Partial Differential Equations

77 - Daniel A. Messenger , David M. Bortz 2020

Sparse Identification of Nonlinear Dynamics (SINDy) is a method of system discovery that has been shown to successfully recover governing dynamical systems from data (Brunton et al., PNAS, 16; Rudy et al., Sci. Adv. 17). Recently, several groups have independently discovered that the weak formulation provides orders of magnitude better robustness to noise. Here we extend our Weak SINDy (WSINDy) framework introduced in (arXiv:2005.04339) to the setting of partial differential equations (PDEs). The elimination of pointwise derivative approximations via the weak form enables effective machine-precision recovery of model coefficients from noise-free data (i.e. below the tolerance of the simulation scheme) as well as robust identification of PDEs in the large noise regime (with signal-to-noise ratio approaching one in many well-known cases). This is accomplished by discretizing a convolutional weak form of the PDE and exploiting separability of test functions for efficient model identification using the Fast Fourier Transform. The resulting WSINDy algorithm for PDEs has a worst-case computational complexity of $mathcal{O}(N^{D+1}log(N))$ for datasets with $N$ points in each of $D+1$ dimensions (i.e. $mathcal{O}(log(N))$ operations per datapoint). Furthermore, our Fourier-based implementation reveals a connection between robustness to noise and the spectra of test functions, which we utilize in an textit{a priori} selection algorithm for test functions. Finally, we introduce a learning algorithm for the threshold in sequential-thresholding least-squares (STLS) that enables model identification from large libraries, and we utilize scale-invariance at the continuum level to identify PDEs from poorly-scaled datasets. We demonstrate WSINDys robustness, speed and accuracy on several challenging PDEs.

التحليل العددي التعلم الآلي التحليل العددي

Neural Time-Dependent Partial Differential Equation

240 - Yihao Hu , Tong Zhao , Zhiliang Xu 2020

Partial differential equations (PDEs) play a crucial role in studying a vast number of problems in science and engineering. Numerically solving nonlinear and/or high-dimensional PDEs is often a challenging task. Inspired by the traditional finite dif ference and finite elements methods and emerging advancements in machine learning, we propose a sequence deep learning framework called Neural-PDE, which allows to automatically learn governing rules of any time-dependent PDE system from existing data by using a bidirectional LSTM encoder, and predict the next n time steps data. One critical feature of our proposed framework is that the Neural-PDE is able to simultaneously learn and simulate the multiscale variables.We test the Neural-PDE by a range of examples from one-dimensional PDEs to a high-dimensional and nonlinear complex fluids model. The results show that the Neural-PDE is capable of learning the initial conditions, boundary conditions and differential operators without the knowledge of the specific form of a PDE system.In our experiments the Neural-PDE can efficiently extract the dynamics within 20 epochs training, and produces accurate predictions. Furthermore, unlike the traditional machine learning approaches in learning PDE such as CNN and MLP which require vast parameters for model precision, Neural-PDE shares parameters across all time steps, thus considerably reduces the computational complexity and leads to a fast learning algorithm.

التحليل العددي التعلم الآلي التحليل العددي

Stiff Neural Ordinary Differential Equations

99 - Suyong Kim , Weiqi Ji , Sili Deng 2021

Neural Ordinary Differential Equations (ODE) are a promising approach to learn dynamic models from time-series data in science and engineering applications. This work aims at learning Neural ODE for stiff systems, which are usually raised from chemic al kinetic modeling in chemical and biological systems. We first show the challenges of learning neural ODE in the classical stiff ODE systems of Robertsons problem and propose techniques to mitigate the challenges associated with scale separations in stiff systems. We then present successful demonstrations in stiff systems of Robertsons problem and an air pollution problem. The demonstrations show that the usage of deep networks with rectified activations, proper scaling of the network outputs as well as loss functions, and stabilized gradient calculations are the key techniques enabling the learning of stiff neural ODE. The success of learning stiff neural ODE opens up possibilities of using neural ODEs in applications with widely varying time-scales, like chemical dynamics in energy conversion, environmental engineering, and the life sciences.

التحليل العددي التعلم الآلي التحليل العددي

Local Extreme Learning Machines and Domain Decomposition for Solving Linear and Nonlinear Partial Differential Equations

116 - Suchuan Dong , Zongwei Li 2020

We present a neural network-based method for solving linear and nonlinear partial differential equations, by combining the ideas of extreme learning machines (ELM), domain decomposition and local neural networks. The field solution on each sub-domain is represented by a local feed-forward neural network, and $C^k$ continuity is imposed on the sub-domain boundaries. Each local neural network consists of a small number of hidden layers, while its last hidden layer can be wide. The weight/bias coefficients in all hidden layers of the local neural networks are pre-set to random values and are fixed, and only the weight coefficients in the output layers are training parameters. The overall neural network is trained by a linear or nonlinear least squares computation, not by the back-propagation type algorithms. We introduce a block time-marching scheme together with the presented method for long-time dynamic simulations. The current method exhibits a clear sense of convergence with respect to the degrees of freedom in the neural network. Its numerical errors typically decrease exponentially or nearly exponentially as the number of degrees of freedom increases. Extensive numerical experiments have been performed to demonstrate the computational performance of the presented method. We compare the current method with the deep Galerkin method (DGM) and the physics-informed neural network (PINN) in terms of the accuracy and computational cost. The current method exhibits a clear superiority, with its numerical errors and network training time considerably smaller (typically by orders of magnitude) than those of DGM and PINN. We also compare the current method with the classical finite element method (FEM). The computational performance of the current method is on par with, and oftentimes exceeds, the FEM performance.

التحليل العددي التعلم الآلي التحليل العددي