
A Modified Batch Intrinsic Plasticity Method for Pre-training the Random Coefficients of Extreme Learning Machines

Published by: Suchuan Dong
Publication date: 2021
Language: English





In extreme learning machines (ELM) the hidden-layer coefficients are randomly set and fixed, while the output-layer coefficients of the neural network are computed by a least squares method. The randomly-assigned coefficients in ELM are known to influence its performance and accuracy significantly. In this paper we present a modified batch intrinsic plasticity (modBIP) method for pre-training the random coefficients in the ELM neural networks. The current method is devised based on the same principle as the batch intrinsic plasticity (BIP) method, namely, by enhancing the information transmission in every node of the neural network. It differs from BIP in two prominent aspects. First, modBIP does not involve the activation function in its algorithm, and it can be applied with any activation function in the neural network. In contrast, BIP employs the inverse of the activation function in its construction, and requires the activation function to be invertible (or monotonic). The modBIP method can work with the often-used non-monotonic activation functions (e.g. Gaussian, swish, Gaussian error linear unit, and radial-basis type functions), with which BIP breaks down. Second, modBIP generates target samples on random intervals with a minimum size, which leads to highly accurate computation results when combined with ELM. The combined ELM/modBIP method is markedly more accurate than ELM/BIP in numerical simulations. Ample numerical experiments are presented with shallow and deep neural networks for function approximation and boundary/initial value problems with partial differential equations. They demonstrate that the combined ELM/modBIP method produces highly accurate simulation results, and that its accuracy is insensitive to the random-coefficient initializations in the neural network. This is in sharp contrast with the ELM results without pre-training of the random coefficients.
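The two ingredients described above can be sketched compactly: pre-train each hidden node by an affine rescaling of its random coefficients so that its pre-activations match target samples drawn on a random interval of at least a minimum size, then compute the output layer by linear least squares as usual in ELM. The NumPy sketch below illustrates this reading of the abstract; the interval construction, the affine fit, and all parameter values (`smin`, the network width, the weight ranges) are assumptions for illustration, not the authors' exact algorithm.

```python
import numpy as np

def modbip_pretrain(X, W, b, smin=1.0, rng=None):
    """Sketch of modBIP-style pre-training: for each hidden node, linearly
    rescale its random weights/bias so that its pre-activations match target
    samples drawn on a random interval of at least size 2*smin.
    (The interval construction is an assumption based on the abstract.)"""
    rng = np.random.default_rng() if rng is None else rng
    n, m = X.shape[0], W.shape[1]
    for j in range(m):
        z = X @ W[:, j] + b[j]               # synaptic inputs at node j
        c = rng.uniform(-1.0, 1.0)           # random interval center
        r = smin + rng.uniform(0.0, 1.0)     # half-width >= smin
        t = np.sort(rng.uniform(c - r, c + r, size=n))  # sorted targets
        zs = np.sort(z)
        # least-squares fit of the affine map a*z + d to the targets
        A = np.column_stack([zs, np.ones(n)])
        (a, d), *_ = np.linalg.lstsq(A, t, rcond=None)
        W[:, j] *= a
        b[j] = a * b[j] + d
    return W, b

def elm_fit(X, y, width=200, act=np.tanh, rng=None):
    """Extreme learning machine: random (pre-trained) hidden layer,
    output-layer weights computed by linear least squares."""
    rng = np.random.default_rng(0) if rng is None else rng
    W = rng.uniform(-1, 1, size=(X.shape[1], width))
    b = rng.uniform(-1, 1, size=width)
    W, b = modbip_pretrain(X, W, b, rng=rng)
    H = act(X @ W + b)                       # hidden-layer features
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta
```

Since the rescaling only composes an affine map with the node's own affine map, it preserves the random-feature character of the hidden layer while controlling the range of the synaptic inputs, independent of the activation function.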




Read also

Shi Jin, Lei Li, Zhenli Xu (2020)
We develop a random batch Ewald (RBE) method for molecular dynamics simulations of particle systems with long-range Coulomb interactions, which achieves an $O(N)$ complexity in each step of simulating the $N$-body systems. The RBE method is based on the Ewald splitting for the Coulomb kernel, with a random mini-batch technique introduced to speed up the summation of the Fourier series for the long-range part of the splitting. Importance sampling is employed to reduce the induced force variance by taking advantage of the fast decay of the Fourier coefficients. The stochastic approximation is unbiased with controlled variance. An analysis for bounded force fields gives some theoretical support for the method. Simulations of two typical problems of charged systems are presented to illustrate the accuracy and efficiency of the RBE method in comparison with the results from the Debye-Hückel theory and the classical Ewald summation, demonstrating that the proposed method is easy to implement, achieves the expected linear scaling, and is promising for many practical applications.
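As a rough illustration of the random mini-batch idea, the sketch below estimates the long-range (Fourier) part of the Ewald forces by sampling k-modes with probability proportional to their Gaussian weights, which is the importance-sampling step that keeps the estimator's variance controlled. The mode cutoff, batch size, and prefactors are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def rbe_fourier_forces(r, q, L, alpha, kmax=8, batch=64, rng=None):
    """Sketch of the random-batch estimate of the long-range (Fourier)
    Ewald forces: sample k-modes with probability ~ exp(-|k|^2/(4 alpha))
    (importance sampling) and average an unbiased per-mode force estimator.
    kmax and batch are illustrative assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    V = L ** 3
    # enumerate the nonzero k-vectors (2 pi / L) * (nx, ny, nz)
    grid = np.arange(-kmax, kmax + 1)
    K = (2 * np.pi / L) * np.array(
        [(i, j, k) for i in grid for j in grid for k in grid
         if (i, j, k) != (0, 0, 0)], dtype=float)
    k2 = np.einsum('ij,ij->i', K, K)
    w = np.exp(-k2 / (4 * alpha))            # Gaussian importance weights
    S = w.sum()
    idx = rng.choice(len(K), size=batch, p=w / S)
    F = np.zeros_like(r, dtype=float)
    for m in idx:
        k, kk = K[m], k2[m]
        rho_k = np.sum(q * np.exp(1j * (r @ k)))      # structure factor
        # unbiased single-mode force contribution; the Gaussian factor is
        # absorbed into the sampling distribution, leaving the S/batch scale
        F += (4 * np.pi / V) * (S / batch) / kk * np.outer(
            q * np.imag(np.exp(-1j * (r @ k)) * rho_k), k)
    return F
```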
Lei Li, Zhenli Xu, Yue Zhao (2020)
We propose a fast potential-splitting Markov chain Monte Carlo method which costs $O(1)$ time per step for sampling from equilibrium distributions (Gibbs measures) corresponding to particle systems with singular interacting kernels. We decompose the interaction potential into two parts: one is long-range but smooth, and the other is short-range but may be singular. To displace a particle, we first evolve a selected particle using the stochastic differential equation (SDE) under the smooth part with the idea of random batches, as commonly used in stochastic gradient Langevin dynamics. Then, we use the short-range part to do a Metropolis rejection. Different from the classical Langevin dynamics, we run the SDE dynamics with random batch only for a short duration of time, so that the cost of the first step is $O(p)$, where $p$ is the batch size. The cost of the rejection step is $O(1)$ since the interaction used is short-range. We justify the proposed random-batch Monte Carlo method, which combines the random batch and splitting strategies, both in theory and with numerical experiments. While giving comparable results for typical examples such as the Dyson Brownian motion and Lennard-Jones fluids, our method saves considerable time compared to the classical Metropolis-Hastings algorithm.
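A single move of this splitting strategy might look as follows in outline: a short random-batch Langevin run under the smooth part, then a Metropolis rejection with the short-range part. The sketch below is a simplified reading of the abstract; in particular, the acceptance rule is the naive one based only on the short-range energy change, and `grad_u1_pair`/`u2_pair` are hypothetical user-supplied pair functions.

```python
import numpy as np

def rbmc_step(x, grad_u1_pair, u2_pair, beta=1.0, tau=0.01, nsub=5,
              p=32, rng=None):
    """Sketch of one random-batch Monte Carlo move (splitting strategy):
    displace one particle by Langevin steps under the smooth long-range
    part, with the pair sum estimated over a random batch of size p, then
    accept/reject with the singular short-range part. Step size, batch
    size, and the acceptance rule are illustrative simplifications."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(x)
    i = rng.integers(n)
    xi_old = x[i].copy()
    others = np.delete(np.arange(n), i)
    xi = xi_old.copy()
    for _ in range(nsub):                    # short SDE run, cost O(p)
        batch = rng.choice(others, size=min(p, n - 1), replace=False)
        # random-batch estimate of the smooth force on particle i
        g = (n - 1) / len(batch) * sum(grad_u1_pair(xi, x[j]) for j in batch)
        xi = xi - tau * g + np.sqrt(2 * tau / beta) * rng.standard_normal(xi.shape)
    # Metropolis rejection with the short-range part (the real method only
    # visits near neighbors, making this step O(1); summed over all here)
    dU2 = sum(u2_pair(xi, x[j]) - u2_pair(xi_old, x[j]) for j in others)
    if rng.random() < np.exp(-beta * dU2):
        x[i] = xi                            # accept the proposed move
    return x
```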
Suchuan Dong, Zongwei Li (2020)
We present a neural network-based method for solving linear and nonlinear partial differential equations, by combining the ideas of extreme learning machines (ELM), domain decomposition and local neural networks. The field solution on each sub-domain is represented by a local feed-forward neural network, and $C^k$ continuity is imposed on the sub-domain boundaries. Each local neural network consists of a small number of hidden layers, while its last hidden layer can be wide. The weight/bias coefficients in all hidden layers of the local neural networks are pre-set to random values and are fixed, and only the weight coefficients in the output layers are training parameters. The overall neural network is trained by a linear or nonlinear least squares computation, not by the back-propagation type algorithms. We introduce a block time-marching scheme together with the presented method for long-time dynamic simulations. The current method exhibits a clear sense of convergence with respect to the degrees of freedom in the neural network. Its numerical errors typically decrease exponentially or nearly exponentially as the number of degrees of freedom increases. Extensive numerical experiments have been performed to demonstrate the computational performance of the presented method. We compare the current method with the deep Galerkin method (DGM) and the physics-informed neural network (PINN) in terms of the accuracy and computational cost. The current method exhibits a clear superiority, with its numerical errors and network training time considerably smaller (typically by orders of magnitude) than those of DGM and PINN. We also compare the current method with the classical finite element method (FEM). The computational performance of the current method is on par with, and oftentimes exceeds, the FEM performance.
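The backbone of the method, a fixed random hidden layer whose output weights are obtained from a linear least-squares collocation system with no back-propagation, can be illustrated on a one-dimensional Poisson problem. The sketch below uses a single ELM block on a single domain; the domain decomposition, $C^k$ continuity conditions, and block time-marching of the paper are omitted, and all parameter values are assumptions.

```python
import numpy as np

def elm_poisson_1d(f, n_colloc=100, width=200, seed=0):
    """Sketch: solve u''(x) = f(x) on [0,1] with u(0) = u(1) = 0 using a
    single ELM block -- a fixed random tanh hidden layer whose output
    weights come from a linear least-squares collocation system."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-5, 5, width)            # fixed random hidden weights
    b = rng.uniform(-5, 5, width)
    x = np.linspace(0, 1, n_colloc)[:, None]
    t = np.tanh(x * w + b)
    H2 = (w ** 2) * (-2 * t * (1 - t ** 2))  # d^2/dx^2 of the tanh features
    # stack the PDE-residual rows and the two boundary-condition rows
    A = np.vstack([H2,
                   np.tanh(np.array([[0.0]]) * w + b),
                   np.tanh(np.array([[1.0]]) * w + b)])
    rhs = np.concatenate([f(x).ravel(), [0.0, 0.0]])
    beta, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return lambda xq: np.tanh(np.asarray(xq)[:, None] * w + b) @ beta

# usage: f(x) = -pi^2 sin(pi x) has the exact solution u = sin(pi x)
u = elm_poisson_1d(lambda x: -np.pi ** 2 * np.sin(np.pi * x))
```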
Suchuan Dong, Naxian Ni (2020)
We present a simple and effective method for representing periodic functions and enforcing exactly the periodic boundary conditions for solving differential equations with deep neural networks (DNN). The method stems from some simple properties about function compositions involving periodic functions. It essentially composes a DNN-represented arbitrary function with a set of independent periodic functions with adjustable (training) parameters. We distinguish two types of periodic conditions: those imposing the periodicity requirement on the function and all its derivatives (to infinite order), and those imposing periodicity on the function and its derivatives up to a finite order $k$ ($k \geqslant 0$). The former will be referred to as $C^{\infty}$ periodic conditions, and the latter $C^{k}$ periodic conditions. We define operations that constitute a $C^{\infty}$ periodic layer and a $C^k$ periodic layer (for any $k \geqslant 0$). A deep neural network with a $C^{\infty}$ (or $C^k$) periodic layer incorporated as the second layer automatically and exactly satisfies the $C^{\infty}$ (or $C^k$) periodic conditions. We present extensive numerical experiments on ordinary and partial differential equations with $C^{\infty}$ and $C^k$ periodic boundary conditions to verify and demonstrate that the proposed method indeed enforces exactly, to the machine accuracy, the periodicity for the DNN solution and its derivatives.
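The mechanism is easy to illustrate: if the raw coordinate is replaced by a set of independent $L$-periodic features with adjustable parameters, then any network composed on top of them is automatically $L$-periodic together with all of its derivatives, which is the role of the $C^{\infty}$ periodic layer. The particular features below (a few sine harmonics with trainable amplitudes and phases) are an illustrative assumption.

```python
import numpy as np

def c_inf_periodic_layer(x, L, n_feat=4, a=None, phi=None):
    """Sketch of a C^infinity periodic layer: map the raw coordinate x to
    independent L-periodic features a_k * sin(2*pi*k*x/L + phi_k). Since
    every feature is L-periodic and smooth, any network composed on top of
    them satisfies the C^infinity periodic conditions exactly."""
    k = np.arange(1, n_feat + 1)
    a = np.ones(n_feat) if a is None else a          # trainable amplitudes
    phi = np.zeros(n_feat) if phi is None else phi   # trainable phases
    return a * np.sin(2 * np.pi * np.outer(x, k) / L + phi)
```

Feeding these features (rather than x itself) into the rest of the network enforces the periodicity exactly, to machine accuracy, regardless of how the remaining layers are trained.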
Support vector machines (SVMs) are successful modeling and prediction tools with a variety of applications. Previous work has demonstrated the superiority of SVMs in dealing with high-dimensional, low-sample-size problems. However, the numerical difficulties of the SVMs become severe as the sample size increases. Although there exist many solvers for the SVMs, only a few of them are designed by exploiting the special structures of the SVMs. In this paper, we propose a highly efficient sparse semismooth Newton based augmented Lagrangian method for solving a large-scale convex quadratic programming problem with a linear equality constraint and a simple box constraint, which is generated from the dual problems of the SVMs. By leveraging the primal-dual error bound result, the fast local convergence rate of the augmented Lagrangian method can be guaranteed. Furthermore, by exploiting the second-order sparsity of the problem when using the semismooth Newton method, the algorithm can efficiently solve the aforementioned difficult problems. Finally, numerical comparisons demonstrate that the proposed algorithm outperforms the current state-of-the-art solvers for the large-scale SVMs.
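To make the problem structure concrete, here is a small dense sketch of an augmented Lagrangian outer loop with a semismooth Newton inner solver for the box-constrained SVM dual QP. It deliberately omits the globalization, the error-bound-based stopping rules, and the second-order sparsity exploitation that the paper relies on at scale, so it illustrates the structure rather than the proposed solver.

```python
import numpy as np

def ssn_alm_svm_dual(Q, y, C, sigma=10.0, outer=20, inner=20, tol=1e-8):
    """Sketch of an augmented Lagrangian method with a semismooth Newton
    inner solver for the SVM dual QP:
        min 0.5 x'Qx - 1'x   s.t.  y'x = 0,  0 <= x <= C.
    The equality constraint goes into the augmented Lagrangian; each inner
    subproblem is a box-constrained QP solved by semismooth Newton on the
    projected residual F(x) = x - proj(x - grad)."""
    n = len(y)
    x, lam = np.zeros(n), 0.0
    proj = lambda z: np.clip(z, 0.0, C)
    for _ in range(outer):
        H = Q + sigma * np.outer(y, y)       # Hessian of the AL subproblem
        for _ in range(inner):
            g = H @ x - 1.0 + lam * y        # gradient of the AL subproblem
            F = x - proj(x - g)
            if np.linalg.norm(F) < tol:
                break
            # generalized Jacobian: identity rows on the clipped (active)
            # set, Hessian rows on the free (inactive) set
            free = (x - g > 0.0) & (x - g < C)
            J = np.eye(n)
            J[free] = H[free]
            x = x - np.linalg.solve(J, F)    # (no globalization: sketch only)
        lam += sigma * (y @ x)               # multiplier update
        if abs(y @ x) < tol:
            break
    return x
```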
