Least-Squares ReLU Neural Network (LSNN) Method For Scalar Nonlinear Hyperbolic Conservation Law

Published by: Jingshuang Chen

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Zhiqiang Cai, Jingshuang Chen, Min Liu

Abstract

We previously introduced the least-squares ReLU neural network (LSNN) method for solving the linear advection-reaction problem with a discontinuous solution and showed that the method outperforms mesh-based numerical methods in terms of the number of degrees of freedom. This paper studies the LSNN method for scalar nonlinear hyperbolic conservation laws. The method is a discretization of an equivalent least-squares (LS) formulation in the set of neural network functions with the ReLU activation function. The LS functional is evaluated using numerical integration and a conservative finite volume scheme. Numerical results for some test problems show that the method is capable of approximating the discontinuous interface of the underlying problem automatically through the free breaking lines of the ReLU neural network. Moreover, the method does not exhibit the common Gibbs phenomena along the discontinuous interface.
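
The role of the free breaking points can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the function `two_neuron_step` and the parameters `a` and `eps` are hypothetical names chosen for illustration. Two ReLU neurons whose breaking points sit at `a` and `a + eps` form a steep ramp from 0 to 1, which approximates a jump at `a` while staying between 0 and 1 (up to floating-point rounding), so no Gibbs-type overshoot appears.

```python
import numpy as np

def relu(z):
    """ReLU activation, applied elementwise."""
    return np.maximum(z, 0.0)

def two_neuron_step(x, a, eps):
    """Two-layer ReLU network with two hidden neurons (illustrative only).

    Hidden layer: breaking points at a and a + eps.
    Output layer: weights +1/eps and -1/eps, giving a ramp from 0 to 1.
    """
    return (relu(x - a) - relu(x - (a + eps))) / eps

# Approximate a unit jump located at x = 0.5 with a ramp of width 1e-3.
x = np.linspace(0.0, 1.0, 1001)
u = two_neuron_step(x, a=0.5, eps=1e-3)

# The approximation stays within [0, 1] up to rounding: no overshoot.
print(u.min(), u.max())
```

In the LSNN setting the breaking points are trainable parameters, so in principle the optimizer can move them toward the discontinuity instead of relying on a fixed mesh.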

Zhiqiang Cai, Jingshuang Chen, Min Liu (2021)

This paper studies the least-squares ReLU neural network method for solving the linear advection-reaction problem with a discontinuous solution. The method is a discretization of an equivalent least-squares formulation in the set of neural network functions with the ReLU activation function. The method is capable of approximating the discontinuous interface of the underlying problem automatically through the free hyper-planes of the ReLU neural network and, hence, outperforms mesh-based numerical methods in terms of the number of degrees of freedom. Numerical results for some benchmark test problems show that the method can not only approximate the solution with the least number of parameters, but also avoid the common Gibbs phenomena along the discontinuous interface. Moreover, a three-layer ReLU neural network is necessary and sufficient to approximate well a discontinuous solution with an interface in $\mathbb{R}^2$ that is not a straight line.

Numerical Analysis, Machine Learning

Adaptive Two-Layer ReLU Neural Network: I. Best Least-squares Approximation

Min Liu, Zhiqiang Cai, Jingshuang Chen (2021)

In this paper, we introduce the adaptive neuron enhancement (ANE) method for the best least-squares approximation using two-layer ReLU neural networks (NNs). For a given function f(x), the ANE method generates a two-layer ReLU NN and a numerical integration mesh such that the approximation accuracy is within the prescribed tolerance. The ANE method provides a natural process for obtaining a good initialization, which is crucial for training nonlinear optimization problems. Numerical results of the ANE method are presented for functions of two variables exhibiting either intersecting interface singularities or sharp interior layers.
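
As a rough illustration of the least-squares sub-problem only (not the ANE method itself, which also adapts the neurons and the integration mesh), the sketch below fixes the hidden-layer breaking points of a one-dimensional two-layer ReLU network and solves for the output weights by discrete least squares on a midpoint grid. The target `f`, the grid size `m`, and the `breakpoints` array are assumptions made for this example.

```python
import numpy as np

def relu(z):
    """ReLU activation, applied elementwise."""
    return np.maximum(z, 0.0)

# Hypothetical target: a kink at x = 0.5 on the interval [0, 1].
f = lambda x: np.abs(x - 0.5)

# Midpoint quadrature nodes with equal weights (assumed integration mesh).
m = 200
x = (np.arange(m) + 0.5) / m

# Fixed breaking points of the hidden neurons (assumed, not adapted here).
breakpoints = np.array([0.0, 0.5])

# Design matrix: a constant (output bias) plus one column per ReLU neuron.
A = np.column_stack([np.ones_like(x)] + [relu(x - b) for b in breakpoints])

# Output weights that minimize the discrete least-squares error.
coef, *_ = np.linalg.lstsq(A, f(x), rcond=None)
max_err = np.max(np.abs(A @ coef - f(x)))
print(coef, max_err)
```

Because one breaking point coincides with the kink of `f`, the target lies in the span of the basis and the fit is exact up to rounding. With misplaced breaking points the error would be nonzero, which is the kind of situation an adaptive strategy is meant to address.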

Numerical Analysis

Convergence bounds for nonlinear least squares and applications to tensor recovery

Philipp Trunschke (2021)

We consider the problem of approximating a function in general nonlinear subsets of $L^2$ when only a weighted Monte Carlo estimate of the $L^2$-norm can be computed. Of particular interest in this setting is the concept of sample complexity, the number of samples that are necessary to recover the best approximation. Bounds for this quantity have been derived in a previous work; they depend primarily on the model class and are not influenced positively by the regularity of the sought function. This result, however, is only a worst-case bound and is not able to explain the remarkable performance of iterative hard thresholding algorithms that is observed in practice. We reexamine the results of the previous paper and derive a new bound that is able to utilize the regularity of the sought function. A critical analysis of our results allows us to derive a sample-efficient algorithm for the model set of low-rank tensors. The viability of this algorithm is demonstrated by recovering quantities of interest for a classical high-dimensional random partial differential equation.

Numerical Analysis, Machine Learning

Convergence bounds for empirical nonlinear least-squares

Martin Eigel, Reinhold Schneider, Philipp Trunschke (2020)

We consider best approximation problems in a nonlinear subset $\mathcal{M}$ of a Banach space of functions $(\mathcal{V},\|\bullet\|)$. The norm is assumed to be a generalization of the $L^2$-norm for which only a weighted Monte Carlo estimate $\|\bullet\|_n$ can be computed. The objective is to obtain an approximation $v\in\mathcal{M}$ of an unknown function $u \in \mathcal{V}$ by minimizing the empirical norm $\|u-v\|_n$. We consider this problem for general nonlinear subsets and establish error bounds for the empirical best approximation error. Our results are based on a restricted isometry property (RIP) which holds in probability and is independent of the nonlinear least squares setting. Several model classes are examined where analytical statements can be made about the RIP and the results are compared to existing sample complexity bounds from the literature. We find that for well-studied model classes our general bound is weaker but exhibits many of the same properties as these specialized bounds. Notably, we demonstrate the advantage of an optimal sampling density (as known for linear spaces) for sets of functions with sparse representations.
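
A weighted Monte Carlo estimate of a squared $L^2$-norm can be sketched as follows. This is only an illustration under stated assumptions, not the paper's code: the test function $v(x) = x$ on $[0,1]$, the sampling density $p(x) = 2x$, and the helper `empirical_sq_norm` are all choices made for this example. Samples are drawn from $p$ and each term is reweighted by $w = 1/p$, so that the sample mean of $w(x_i)\,v(x_i)^2$ estimates $\int_0^1 v(x)^2\,dx = 1/3$.

```python
import numpy as np

def empirical_sq_norm(v, samples, weights):
    """Weighted Monte Carlo estimate of the squared L2 norm of v."""
    return np.mean(weights * v(samples) ** 2)

rng = np.random.default_rng(0)
n = 200_000

# Draw from the density p(x) = 2x on (0, 1] by inverse-CDF sampling:
# the CDF is x^2, so x = sqrt(U) with U uniform on (0, 1].
samples = np.sqrt(1.0 - rng.random(n))

# Importance weights w = 1 / p(x) make the estimator unbiased.
weights = 1.0 / (2.0 * samples)

est = empirical_sq_norm(lambda x: x, samples, weights)
print(est)  # should be close to the exact value 1/3
```

The choice of sampling density changes the variance of the estimate but not its expectation, which is why the sampling density is a natural object to optimize.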

Numerical Analysis, Probability

Windowed space-time least-squares Petrov-Galerkin method for nonlinear model order reduction

Yukiko S. Shimizu, Eric J. Parish (2020)

This work presents the windowed space-time least-squares Petrov-Galerkin method (WST-LSPG) for model reduction of nonlinear parameterized dynamical systems. WST-LSPG is a generalization of the space-time least-squares Petrov-Galerkin method (ST-LSPG). The main drawback of ST-LSPG is that it requires solving a dense space-time system with a space-time basis that is calculated over the entire global time domain, which can be infeasible for large-scale applications. Instead of using a temporally-global space-time trial subspace and minimizing the discrete-in-time full-order model (FOM) residual over an entire time domain, the proposed WST-LSPG approach addresses this weakness by (1) dividing the time simulation into time windows, (2) devising a unique low-dimensional space-time trial subspace for each window, and (3) minimizing the discrete-in-time space-time residual of the dynamical system over each window. This formulation yields a problem with coupling confined within each window, but sequential across the windows. To enable high-fidelity trial subspaces characterized by a relatively small number of basis vectors, this work proposes constructing space-time bases using tensor decompositions for each window. WST-LSPG is equipped with hyper-reduction techniques to further reduce the computational cost. Numerical experiments for the one-dimensional Burgers equation and the two-dimensional compressible Navier-Stokes equations for flow over a NACA 0012 airfoil demonstrate that WST-LSPG is superior to ST-LSPG in terms of accuracy and computational gain.

Numerical Analysis