No Arabic abstract
In an iterative approach for solving linear systems with ill-conditioned, symmetric positive definite (SPD) kernel matrices, both fast matrix-vector products and fast preconditioning operations are required. Fast (linear-scaling) matrix-vector products are available by expressing the kernel matrix in an $mathcal{H}^2$ representation or an equivalent fast multipole method representation. Preconditioning such matrices, however, requires a structured matrix approximation that is more regular than the $mathcal{H}^2$ representation, such as the hierarchically semiseparable (HSS) matrix representation, which provides fast solve operations. Previously, an algorithm was presented to construct an HSS approximation to an SPD kernel matrix that is guaranteed to be SPD. However, this algorithm has quadratic cost and was only designed for recursive binary partitionings of the points defining the kernel matrix. This paper presents a general algorithm for constructing an SPD HSS approximation. Importantly, the algorithm uses the $mathcal{H}^2$ representation of the SPD matrix to reduce its computational complexity from quadratic to quasilinear. Numerical experiments illustrate how this SPD HSS approximation performs as a preconditioner for solving linear systems arising from a range of kernel functions.
Positive semi-definite matrices commonly occur as normal matrices of least squares problems in statistics or as kernel matrices in machine learning and approximation theory. They are typically large and dense. Thus algorithms to solve systems with such a matrix can be very costly. A core idea to reduce computational complexity is to approximate the matrix by one with a low rank. The optimal and well understood choice is based on the eigenvalue decomposition of the matrix. Unfortunately, this is computationally very expensive. Cheaper methods are based on Gaussian elimination but they require pivoting. We will show how invariant matrix theory provides explicit error formulas for an averaged error based on volume sampling. The formula leads to ratios of elementary symmetric polynomials on the eigenvalues. We discuss some new an old bounds and include several examples where an expected error norm can be computed exactly.
We present the Cholesky-factored symmetric positive definite neural network (SPD-NN) for modeling constitutive relations in dynamical equations. Instead of directly predicting the stress, the SPD-NN trains a neural network to predict the Cholesky factor of a tangent stiffness matrix, based on which the stress is calculated in the incremental form. As a result of the special structure, SPD-NN weakly imposes convexity on the strain energy function, satisfies time consistency for path-dependent materials, and therefore improves numerical stability, especially when the SPD-NN is used in finite element simulations. Depending on the types of available data, we propose two training methods, namely direct training for strain and stress pairs and indirect training for loads and displacement pairs. We demonstrate the effectiveness of SPD-NN on hyperelastic, elasto-plastic, and multiscale fiber-reinforced plate problems from solid mechanics. The generality and robustness of the SPD-NN make it a promising tool for a wide range of constitutive modeling applications.
Spatial symmetries and invariances play an important role in the description of materials. When modelling material properties, it is important to be able to respect such invariances. Here we discuss how to model and generate random ensembles of tensors where one wants to be able to prescribe certain classes of spatial symmetries and invariances for the whole ensemble, while at the same time demanding that the mean or expected value of the ensemble be subject to a possibly higher spatial invariance class. Our special interest is in the class of physically symmetric and positive definite tensors, as they appear often in the description of materials. As the set of positive definite tensors is not a linear space, but rather an open convex cone in the linear vector space of physically symmetric tensors, it may be advantageous to widen the notion of mean to the so-called Frechet mean, which is based on distance measures between positive definite tensors other than the usual Euclidean one. For the sake of simplicity, as well as to expose the main idea as clearly as possible, we limit ourselves here to second order tensors. It is shown how the random ensemble can be modelled and generated, with fine control of the spatial symmetry or invariance of the whole ensemble, as well as its Frechet mean, independently in its scaling and directional aspects. As an example, a 2D and a 3D model of steady-state heat conduction in a human proximal femur, a bone with high material anisotropy, is explored. It is modelled with a random thermal conductivity tensor, and the numerical results show the distinct impact of incorporating into the constitutive model different material uncertainties$-$scaling, orientation, and prescribed material symmetry$-$on the desired quantities of interest, such as temperature distribution and heat flux.
We explore the connection between two problems that have arisen independently in the signal processing and related fields: the estimation of the geometric mean of a set of symmetric positive definite (SPD) matrices and their approximate joint diagonalization (AJD). Today there is a considerable interest in estimating the geometric mean of a SPD matrix set in the manifold of SPD matrices endowed with the Fisher information metric. The resulting mean has several important invariance properties and has proven very useful in diverse engineering applications such as biomedical and image data processing. While for two SPD matrices the mean has an algebraic closed form solution, for a set of more than two SPD matrices it can only be estimated by iterative algorithms. However, none of the existing iterative algorithms feature at the same time fast convergence, low computational complexity per iteration and guarantee of convergence. For this reason, recently other definitions of geometric mean based on symmetric divergence measures, such as the Bhattacharyya divergence, have been considered. The resulting means, although possibly useful in practice, do not satisfy all desirable invariance properties. In this paper we consider geometric means of co-variance matrices estimated on high-dimensional time-series, assuming that the data is generated according to an instantaneous mixing model, which is very common in signal processing. We show that in these circumstances we can approximate the Fisher information geometric mean by employing an efficient AJD algorithm. Our approximation is in general much closer to the Fisher information geometric mean as compared to its competitors and verifies many invariance properties. Furthermore, convergence is guaranteed, the computational complexity is low and the convergence rate is quadratic. The accuracy of this new geometric mean approximation is demonstrated by means of simulations.
In this paper, we develop a new classification method for manifold-valued data in the framework of probabilistic learning vector quantization. In many classification scenarios, the data can be naturally represented by symmetric positive definite matrices, which are inherently points that live on a curved Riemannian manifold. Due to the non-Euclidean geometry of Riemannian manifolds, traditional Euclidean machine learning algorithms yield poor results on such data. In this paper, we generalize the probabilistic learning vector quantization algorithm for data points living on the manifold of symmetric positive definite matrices equipped with Riemannian natural metric (affine-invariant metric). By exploiting the induced Riemannian distance, we derive the probabilistic learning Riemannian space quantization algorithm, obtaining the learning rule through Riemannian gradient descent. Empirical investigations on synthetic data, image data , and motor imagery EEG data demonstrate the superior performance of the proposed method.