We introduce a fast algorithm for entry-wise evaluation of the Gauss-Newton Hessian (GNH) matrix for the fully-connected feed-forward neural network. The algorithm has a precomputation step and a sampling step. While it generally requires $O(Nn)$ work to compute an entry (and the entire column) in the GNH matrix for a neural network with $N$ parameters and $n$ data points, our fast sampling algorithm reduces the cost to $O(n+d/\epsilon^2)$ work, where $d$ is the output dimension of the network and $\epsilon$ is a prescribed accuracy (independent of $N$). One application of our algorithm is constructing the hierarchical-matrix (H-matrix) approximation of the GNH matrix for solving linear systems and eigenvalue problems. It generally requires $O(N^2)$ memory and $O(N^3)$ work to store and factorize the GNH matrix, respectively. The H-matrix approximation requires only $O(N r_o)$ memory footprint and $O(N r_o^2)$ work to be factorized, where $r_o \ll N$ is the maximum rank of off-diagonal blocks in the GNH matrix. We demonstrate the performance of our fast algorithm and the H-matrix approximation on classification and autoencoder neural networks.
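As a rough illustration of why a sampling-based entry evaluation carries a $1/\epsilon^2$ term, the sketch below compares an exact entry of $G = J^T J / n$ with a Monte Carlo estimate built from uniformly sampled rows of $J$. Everything here is an assumption made for illustration: the explicit Jacobian `J` (filled with random numbers instead of coming from a network), the least-squares form of the GNH, the uniform row sampling, and the function names `exact_gnh_entry` and `sampled_gnh_entry`. It is not the precomputation and sampling scheme of the paper, only a generic estimator whose error decays like $1/\sqrt{m}$ in the number of samples $m$.

```python
import random


def exact_gnh_entry(J, i, j, n):
    """Exact entry (i, j) of G = J^T J / n; touches every row of J."""
    return sum(row[i] * row[j] for row in J) / n


def sampled_gnh_entry(J, i, j, n, num_samples, rng):
    """Unbiased Monte Carlo estimate of G[i, j] from uniformly sampled rows.

    Uniform sampling is a stand-in chosen for simplicity (assumption);
    it is not the sampling distribution used in the paper.
    """
    total = 0.0
    for _ in range(num_samples):
        row = J[rng.randrange(len(J))]
        total += row[i] * row[j]
    # Rescale the sample mean by the number of rows to estimate the full sum.
    return len(J) * (total / num_samples) / n


rng = random.Random(0)
n, d, N = 500, 2, 8  # data points, output dimension, parameters (toy sizes)
# Hypothetical Jacobian with n * d rows and N columns, random for illustration.
J = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(n * d)]

exact = exact_gnh_entry(J, 3, 3, n)
approx = sampled_gnh_entry(J, 3, 3, n, 4000, rng)
print(f"exact={exact:.3f} approx={approx:.3f}")
```

Halving the target error in this toy estimator requires roughly four times as many samples, which is the same $\epsilon^{-2}$ scaling that appears in the stated cost.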
This paper is concerned with the introduction of Tikhonov regularization into least squares approximation scheme on $[-1,1]$ by orthonormal polynomials, in order to handle noisy data. This scheme includes interpolation and hyperinterpolation as speci
We analyze the Lanczos method for matrix function approximation (Lanczos-FA), an iterative algorithm for computing $f(\mathbf{A}) \mathbf{b}$ when $\mathbf{A}$ is a Hermitian matrix and $\mathbf{b}$ is a given vector. Assuming that $f : \mathbb{C} righ
We consider a scalar function depending on a numerical solution of an initial value problem, and its second-derivative (Hessian) matrix for the initial value. The need to extract the information of the Hessian or to solve a linear system having the H
In this article, we present an $O(N \log N)$ rapidly convergent algorithm for the numerical approximation of the convolution integral with radially symmetric weakly singular kernels and compactly supported densities. To achieve the reduced computation
Let $A$ be a square complex matrix; $z_1$, ..., $z_{N}\in\mathbb{C}$ be arbitrary (possibly repetitive) points of interpolation; $f$ be an analytic function defined on a neighborhood of the convex hull of the union of the spectrum $\sigma(A)$ of the matr