In this paper, we present a deep autoencoder based energy method (DAEM) for the bending, vibration and buckling analysis of Kirchhoff plates. The method exploits the higher-order continuity of the DAEM and integrates a deep autoencoder and the principle of minimum total potential energy in one framework, yielding an unsupervised feature learning method. The DAEM is a specific type of feedforward deep neural network (DNN) and can also serve as a function approximator. With its robust feature extraction capacity, the DAEM can more efficiently identify patterns behind the whole energy system, such as the field variables, natural frequency and critical buckling load factor studied in this paper. The objective is to minimize the total potential energy. The DAEM performs unsupervised learning based on randomly generated points inside the physical domain so that the total potential energy is minimized at all points. For vibration and buckling analysis, the loss function is constructed based on Rayleigh's principle, and the fundamental frequency and the critical buckling load are extracted. A scaled hyperbolic tangent activation function for the underlying mechanical model is presented, which meets the continuity requirement and alleviates the vanishing/exploding gradient problems in bending analysis. The DAEM is easy to implement; we employed the PyTorch library and the L-BFGS optimizer. A comprehensive study of the DAEM configuration is performed for several numerical examples with various geometries, load conditions, and boundary conditions.
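The energy-minimization idea can be illustrated with a minimal sketch that is not the DAEM itself. As an assumption made only for illustration, the network is replaced by a one-parameter trial deflection $w(x, y) = a \sin(\pi x) \sin(\pi y)$ on a simply supported unit square plate under a uniform load $q$, and the energy integrals are estimated at randomly generated points. The function names, the load `q`, the flexural rigidity `D`, and Poisson's ratio `nu` are hypothetical and do not come from the paper.

```python
import math
import random


def plate_energy_coefficients(n_points, nu=0.3, seed=0):
    """Monte Carlo estimates of the two integrals in the plate energy.

    Assumed trial deflection on the unit square:
        w(x, y) = a * sin(pi x) * sin(pi y)
    so that Pi(a) = 0.5 * D * a**2 * k_hat - q * a * f_hat.
    """
    rng = random.Random(seed)
    k_sum = 0.0
    f_sum = 0.0
    for _ in range(n_points):
        # Random point inside the physical domain (unit square, area 1).
        x, y = rng.random(), rng.random()
        sx, sy = math.sin(math.pi * x), math.sin(math.pi * y)
        cx, cy = math.cos(math.pi * x), math.cos(math.pi * y)
        # Second derivatives of the unit-amplitude trial function.
        w_xx = -math.pi ** 2 * sx * sy
        w_yy = -math.pi ** 2 * sx * sy
        w_xy = math.pi ** 2 * cx * cy
        # Kirchhoff bending energy density (without the factor D / 2).
        k_sum += (w_xx + w_yy) ** 2 - 2.0 * (1.0 - nu) * (w_xx * w_yy - w_xy ** 2)
        # Work of a unit uniform load on the unit-amplitude deflection.
        f_sum += sx * sy
    return k_sum / n_points, f_sum / n_points


def minimise_energy(q=1.0, D=1.0, n_points=20000):
    """Amplitude a that minimises the sampled total potential energy."""
    k_hat, f_hat = plate_energy_coefficients(n_points)
    # dPi/da = D * a * k_hat - q * f_hat = 0
    return q * f_hat / (D * k_hat)


a_mc = minimise_energy()
a_exact = 4.0 / math.pi ** 6  # first Navier series term for q = D = 1
print(a_mc, a_exact)
```

Because the sampled energy is quadratic in the single unknown `a`, the minimizer is available in closed form here; with a neural network as the trial function, the same sampled energy would instead be minimized iteratively by an optimizer.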
In this paper, a deep collocation method (DCM) for thin plate bending problems is proposed. This method takes advantage of the computational graphs and backpropagation algorithms involved in deep learning. The proposed DCM is based on a feedforward deep neural network (DNN) and differs from most previous applications of deep learning to mechanical problems. First, batches of randomly distributed collocation points are generated inside the domain and along the boundaries. A loss function is then built so that the residuals of the governing partial differential equations (PDEs) of Kirchhoff plate bending problems and of the boundary/initial conditions are minimised at those collocation points. A combination of optimizers is adopted in the backpropagation process to minimize the loss function and obtain the optimal network parameters. In Kirchhoff plate bending problems, the $C^1$ continuity requirement poses significant difficulties for traditional mesh-based methods. The proposed DCM addresses this by using a deep neural network to approximate the continuous transversal deflection, and it is shown to be suitable for the bending analysis of Kirchhoff plates of various geometries.
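A collocation loss of the kind described above might take the following form. This is a sketch under assumed notation, not the paper's exact definition: $w_\theta$ is the network approximation of the deflection, $D$ the flexural rigidity, $q$ the transverse load, $\mathcal{B}$ a generic boundary operator with data $g$, and $N_\Omega$, $N_\Gamma$ the numbers of interior and boundary collocation points. Equal weighting of the two terms is also an assumption.

```latex
\mathcal{L}(\theta)
  = \frac{1}{N_\Omega} \sum_{i=1}^{N_\Omega}
      \bigl| D \, \nabla^4 w_\theta(\mathbf{x}_i) - q(\mathbf{x}_i) \bigr|^2
  + \frac{1}{N_\Gamma} \sum_{j=1}^{N_\Gamma}
      \bigl| \mathcal{B} \, w_\theta(\mathbf{x}_j) - g(\mathbf{x}_j) \bigr|^2
```

The first sum penalizes the residual of the Kirchhoff plate equation $D \nabla^4 w = q$ at interior points, and the second penalizes the mismatch in the boundary conditions.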
This paper studies an unsupervised deep learning-based numerical approach for solving partial differential equations (PDEs). The approach uses deep neural networks to approximate solutions of PDEs through compositional construction and employs least-squares functionals as loss functions to determine the network parameters. Various least-squares functionals exist for a given PDE. This paper focuses on the so-called first-order system least-squares (FOSLS) functional studied in [3], which is based on a first-order system of scalar second-order elliptic PDEs. Numerical results for second-order elliptic PDEs in one dimension are presented.
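To make the first-order system idea concrete, consider a model problem chosen here purely for illustration (it is an assumption, not necessarily the one treated in the paper): $-u'' = f$ on $(0, 1)$ with $u(0) = u(1) = 0$. Introducing the flux $\sigma = -u'$ turns the second-order equation into a first-order system, and a least-squares functional sums the squared $L^2$ residuals of that system.

```latex
\begin{aligned}
  \sigma + u' &= 0, \qquad \sigma' = f \quad \text{in } (0, 1), \\
  G(u, \sigma; f) &= \| \sigma + u' \|_{L^2(0,1)}^2 + \| \sigma' - f \|_{L^2(0,1)}^2 .
\end{aligned}
```

The functional is nonnegative and vanishes exactly at the solution pair $(u, \sigma)$, which is what makes it usable as a loss function when $u$ and $\sigma$ are represented by a neural network.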
Finding parameters in a deep neural network (NN) that fit training data is a nonconvex optimization problem, but a basic first-order optimization method (gradient descent) finds a global solution with perfect fit in many practical situations. We examine this phenomenon for the case of Residual Neural Networks (ResNet) with smooth activation functions in a limiting regime in which both the number of layers (depth) and the number of neurons in each layer (width) go to infinity. First, we use a mean-field-limit argument to prove that gradient descent for parameter training becomes a partial differential equation (PDE) that characterizes gradient flow for a probability distribution in the large-NN limit. Next, we show that the solution to the PDE converges in the training time to a zero-loss solution. Together, these results imply that training of the ResNet also gives a near-zero loss if the ResNet is large enough. We give estimates of the depth and width needed to reduce the loss below a given threshold, with high probability.
A local discontinuous Galerkin (LDG) method for approximating large deformations of prestrained plates was introduced and tested on several insightful numerical examples in our previous computational work. This paper presents a numerical analysis of this LDG method, focusing on the free boundary case. The problem consists of minimizing a fourth-order bending energy subject to a nonlinear and nonconvex metric constraint. The energy is discretized using LDG, and a discrete gradient flow is used to compute discrete minimizers. We first show $\Gamma$-convergence of the discrete energy to the continuous one. Then we prove that the discrete gradient flow decreases the energy at each step and computes discrete minimizers with control of the metric constraint defect. We also present a numerical scheme for initializing the gradient flow and discuss its conditional stability.
We study gradient-based regularization methods for neural networks. We focus mainly on two regularization methods: total variation and Tikhonov regularization. Applying these methods is equivalent to using neural networks to solve certain partial differential equations, mostly in high dimensions in practical applications. In this work, we introduce a general framework to analyze the generalization error of regularized networks. The error estimate relies on two assumptions, on the approximation error and on the quadrature error. Moreover, we conduct experiments on image classification tasks to show that gradient-based methods can significantly improve the generalization ability and adversarial robustness of neural networks. A graphical extension of the gradient-based methods is also considered in the experiments.
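As a toy illustration of a Tikhonov-type gradient penalty, the sketch below fits a one-dimensional linear model $f(x) = w x + b$ instead of a neural network (an assumption made so that the example stays self-contained). For this model the input gradient is simply $w$, so penalizing its squared norm adds `lam * w**2` to the mean squared error. The function name `fit`, the data, and the values of `lam`, `lr` and `steps` are hypothetical.

```python
def fit(xs, ys, lam, lr=0.1, steps=2000):
    """Gradient descent on mean squared error + lam * |df/dx|**2.

    For f(x) = w * x + b the input gradient df/dx equals w, so the
    Tikhonov-type penalty reduces to lam * w**2.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the data term plus the gradient of the penalty.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_w += 2.0 * lam * w
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [2.0 * x + 1.0 for x in xs]  # noise-free samples of y = 2x + 1

w_plain, b_plain = fit(xs, ys, lam=0.0)  # recovers the slope 2
w_reg, b_reg = fit(xs, ys, lam=0.5)      # penalty shrinks the slope
print(w_plain, b_plain, w_reg, b_reg)
```

The penalty lowers the slope, and with it the sensitivity of the output to input perturbations, while leaving the intercept unchanged. For a nonlinear network the input gradient varies from point to point and would have to be evaluated at each sample.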