In data processing and machine learning, an important challenge is to recover and exploit models that accurately represent the data. We consider the problem of recovering Gaussian mixture models from datasets. We investigate symmetric tensor decomposition methods for tackling this problem, where the tensor is built from empirical moments of the data distribution. We consider identifiable tensors, which have a unique decomposition, and show that moment tensors built from spherical Gaussian mixtures have this property. We prove that symmetric tensors with interpolation degree strictly less than half their order are identifiable, and we present an algorithm, based on simple linear algebra operations, to compute their decomposition. Illustrative experiments show the impact of the tensor decomposition method for recovering Gaussian mixtures, in comparison with other state-of-the-art approaches.
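A minimal sketch (not the decomposition algorithm of the paper) of the kind of object it works with: the empirical third-order moment tensor of a dataset, here built with NumPy from samples of a spherical Gaussian mixture. The mixture parameters and sample sizes are purely illustrative.

```python
import numpy as np

def empirical_third_moment(X):
    """Symmetric 3-way tensor (1/p) * sum_i x_i (x) x_i (x) x_i for rows x_i of X."""
    p, n = X.shape
    return np.einsum('ia,ib,ic->abc', X, X, X) / p

# Illustrative data: a two-component spherical Gaussian mixture in dimension 3.
rng = np.random.default_rng(0)
mus = np.array([[3.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
labels = rng.integers(0, 2, size=2000)
X = mus[labels] + rng.normal(scale=0.5, size=(2000, 3))
T = empirical_third_moment(X)   # shape (3, 3, 3), symmetric in all three indices
```

The paper's algorithm then decomposes such symmetric moment tensors, using linear algebra operations, to recover the component means.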
We analyze the decomposition problem of multivariate polynomial-exponential functions from truncated series and present new algorithms to compute their decomposition. Using the duality between polynomials and formal power series, we first show how the elements in the dual of an Artinian algebra correspond to polynomial-exponential functions. They are also the solutions of systems of partial differential equations with constant coefficients. We relate their representation to the inverse system of the roots of the characteristic variety. Using the properties of Hankel operators, we establish a correspondence between polynomial-exponential series and Artinian Gorenstein algebras. We generalize Kronecker's theorem to the multivariate case, by showing that the symbol of a Hankel operator of finite rank is a polynomial-exponential series and by connecting the rank of the Hankel operator with the decomposition of the symbol. A generalization of Prony's approach to multivariate decomposition problems is presented, exploiting eigenvector methods for solving polynomial equations. We show how to compute the frequencies and weights of a minimal polynomial-exponential decomposition, using the first coefficients of the series. A key ingredient of the approach is the flat extension criterion, which leads to a multivariate generalization of a rank condition for a Carathéodory-Fejér decomposition of multivariate Hankel matrices. A new algorithm is given to compute a basis of the Artinian Gorenstein algebra, based on a Gram-Schmidt orthogonalization process, and to decompose polynomial-exponential series. A general framework for the applications of this approach is described and illustrated on different problems. We provide Kronecker-type theorems for convolution operators, showing that a convolution operator (or a cross-correlation operator) is of finite rank if and only if its symbol is a polynomial-exponential function, and we relate its rank to the decomposition of its symbol. We also present Kronecker-type theorems for the reconstruction of measures as weighted sums of Dirac measures from moments and for the decomposition of polynomial-exponential functions from values. Finally, we describe an application of this method to the sparse interpolation of polylog functions from values.
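For orientation, here is a minimal sketch of the classical univariate Prony method that the paper generalizes: given the first $2r$ coefficients $f_k = \sum_j w_j z_j^k$ of an exponential sum, the nodes $z_j$ are the roots of an annihilating polynomial found from a Hankel system, and the weights follow from a Vandermonde system. The function and variable names are illustrative.

```python
import numpy as np

def prony(samples, r):
    """Recover nodes z_j and weights w_j of f_k = sum_j w_j z_j^k from 2r samples."""
    f = np.asarray(samples, dtype=complex)
    # Hankel system H c = -f[r:2r] for the monic annihilating polynomial.
    H = np.array([f[i:i + r] for i in range(r)])
    c = np.linalg.solve(H, -f[r:2 * r])
    poly = np.concatenate(([1.0], c[::-1]))      # z^r + c_{r-1} z^{r-1} + ... + c_0
    nodes = np.roots(poly)
    # Weights from the Vandermonde system V w = f[:r], with V[k, j] = z_j^k.
    V = np.vander(nodes, r, increasing=True).T
    weights = np.linalg.solve(V, f[:r])
    return nodes, weights

# Example: f_k = 2 * 0.9^k + (-0.5)^k, recovered from its first 4 values.
f = [2 * 0.9**k + (-0.5)**k for k in range(4)]
nodes, weights = prony(f, 2)
```

In the multivariate setting of the paper, the role of the Hankel matrix and its flat extension is played by truncated Hankel operators on the Artinian Gorenstein algebra, and the roots are obtained by eigenvector methods.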
We propose a method (TT-GP) for approximate inference in Gaussian Process (GP) models. We build on previous scalable GP research including stochastic variational inference based on inducing inputs, kernel interpolation, and structure exploiting algebra. The key idea of our method is to use Tensor Train decomposition for variational parameters, which allows us to train GPs with billions of inducing inputs and achieve state-of-the-art results on several benchmarks. Further, our approach allows for training kernels based on deep neural networks without any modifications to the underlying GP model. A neural network learns a multidimensional embedding for the data, which is used by the GP to make the final prediction. We train GP and neural network parameters end-to-end without pretraining, through maximization of GP marginal likelihood. We show the efficiency of the proposed approach on several regression and classification benchmark datasets including MNIST, CIFAR-10, and Airline.
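As background, the following sketch illustrates the Tensor Train (TT) format that the variational parameters are stored in, using the classical TT-SVD on a dense array. This only shows the representation itself, not the TT-GP training procedure; the function name and rank choice are illustrative.

```python
import numpy as np

def tt_svd(A, max_rank):
    """Decompose a d-way array A into TT cores G_k of shape (r_{k-1}, n_k, r_k)."""
    dims = A.shape
    cores, r_prev = [], 1
    C = np.asarray(A).reshape(1, -1)
    for k in range(len(dims) - 1):
        C = C.reshape(r_prev * dims[k], -1)
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        C = s[:r, None] * Vt[:r]                 # carry the remainder to the next mode
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

# Usage: compress a random 4-way tensor and rebuild it from its cores.
A = np.random.default_rng(0).normal(size=(4, 5, 6, 7))
cores = tt_svd(A, max_rank=3)
B = cores[0]
for G in cores[1:]:
    B = np.einsum('...a,aib->...ib', B, G)
B = B.reshape(A.shape)                           # TT approximation with ranks <= 3
```

Storing the variational mean in this chain-of-cores form is what lets the number of inducing inputs grow to billions while the parameter count stays linear in the number of modes.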
We consider the problem of decomposing higher-order moment tensors, i.e., the sum of symmetric outer products of data vectors. Such a decomposition can be used to estimate the means in a Gaussian mixture model and for other applications in machine learning. The $d$th-order empirical moment tensor of a set of $p$ observations of $n$ variables is a symmetric $d$-way tensor. Our goal is to find a low-rank tensor approximation comprising $r \ll p$ symmetric outer products. The challenge is that forming the empirical moment tensor costs $O(pn^d)$ operations and $O(n^d)$ storage, which may be prohibitively expensive; additionally, the algorithm to compute the low-rank approximation costs $O(n^d)$ per iteration. Our contribution is to avoid forming the moment tensor, computing the low-rank tensor approximation of the moment tensor implicitly using $O(pnr)$ operations per iteration and no extra memory. This advance opens the door to more applications of higher-order moments since they can now be efficiently computed. We present numerical evidence of the computational savings and show an example of estimating the means for higher-order moments.
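A minimal sketch of the implicit idea (names are illustrative, not the paper's code): contracting the $d$th-order empirical moment tensor $M = \frac{1}{p}\sum_i x_i^{\otimes d}$ with a vector $a$ along $d-1$ modes equals $\frac{1}{p}\sum_i (x_i \cdot a)^{d-1} x_i$, which takes $O(pn)$ operations and never materializes the $O(n^d)$ tensor.

```python
import numpy as np

def implicit_moment_contraction(X, a, d):
    """M x_2 a x_3 a ... x_d a for the d-th order moment tensor of the rows of X."""
    p = X.shape[0]
    s = X @ a                        # inner products x_i . a in one pass over the data
    return (s ** (d - 1)) @ X / p    # sum_i (x_i . a)^{d-1} x_i / p

# With a factor matrix A whose r columns are contracted simultaneously, the cost
# is O(pnr) per sweep, matching the per-iteration cost quoted in the abstract.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))      # p = 1000 observations of n = 20 variables
A = rng.normal(size=(20, 5))         # r = 5 components
G = ((X @ A) ** 2).T @ X / X.shape[0]   # d = 3: row j is the contraction with a_j
```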
Machine learning techniques allow a direct mapping of atomic positions and nuclear charges to the potential energy surface with almost ab initio accuracy and the computational efficiency of empirical potentials. In this work we propose a machine learning method for constructing high-dimensional potential energy surfaces based on feed-forward neural networks. As input to the neural network we propose an extendable invariant local molecular descriptor constructed from geometric moments. Their formulation via pairwise distance vectors and tensor contractions allows a very efficient implementation on graphical processing units (GPUs). The atomic species is encoded in the molecular descriptor, which allows a single neural network to be trained on all atomic species in the data set. We demonstrate that the accuracy of the developed approach in representing both chemical and configurational spaces is comparable to that of several established machine learning models. Due to its high accuracy and efficiency, the proposed machine-learned potentials can be used for further tasks, for example the optimization of molecular geometries, the calculation of rate constants, or molecular dynamics.
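A simplified, hypothetical sketch of the kind of descriptor described above: per-atom geometric moments built from unit pairwise distance vectors with a radial weight, followed by tensor contractions that yield rotationally invariant features. The radial weight and the particular contractions used here are illustrative, not the paper's exact formulation.

```python
import numpy as np

def moment_descriptor(positions, i, cutoff=5.0):
    """Rotation-invariant features for atom i from 0th/1st/2nd geometric moments."""
    r = np.delete(positions - positions[i], i, axis=0)   # pairwise distance vectors
    d = np.linalg.norm(r, axis=1)
    r, d = r[d < cutoff], d[d < cutoff]
    w = np.exp(-d)                                        # illustrative radial weight
    u = r / d[:, None]                                    # unit distance vectors
    m0 = w.sum()                                          # scalar moment
    m1 = np.einsum('j,ja->a', w, u)                       # vector moment
    m2 = np.einsum('j,ja,jb->ab', w, u, u)                # 2nd-order moment (3x3)
    # Full contractions (dot products, traces) are unchanged by rotations.
    return np.array([m0, m1 @ m1, np.trace(m2), np.einsum('ab,ab->', m2, m2)])
```

Because the moments reduce to batched dot products and einsum-style contractions, they map directly onto GPU-friendly dense linear algebra.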
The tensor decomposition addressed in this paper may be seen as a generalisation of the Singular Value Decomposition of matrices. We consider general multilinear and multihomogeneous tensors. We show how to reduce the problem to a truncated moment matrix problem and give a new criterion for flat extension of Quasi-Hankel matrices. We connect this criterion to the commutation characterisation of border bases. A new algorithm is described. It applies to general multihomogeneous tensors, extending the approach of J.J. Sylvester to binary forms. An example illustrates the algebraic operations involved in this approach and how the decomposition can be recovered from eigenvector computation.
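A minimal numeric sketch of Sylvester's approach for binary forms, the special case that the paper extends: for $f(x,y) = \sum_k \binom{d}{k} a_k x^{d-k} y^k$ of rank $r$ with a decomposition $f = \sum_l w_l (x + t_l y)^d$ (assuming no point at infinity), the coefficients satisfy $a_k = \sum_l w_l t_l^k$, so the nodes $t_l$ come from a kernel vector of the Hankel (catalecticant) matrix of the $a_k$. The function name and the use of a numerical SVD kernel are illustrative.

```python
import numpy as np

def sylvester_binary(a, r):
    """Recover nodes t_l and weights w_l with a_k = sum_l w_l t_l^k, k = 0..d."""
    a = np.asarray(a, dtype=float)
    d = len(a) - 1
    # Catalecticant (Hankel) matrix H[i, j] = a_{i+j} with r+1 columns; its kernel
    # gives the apolar polynomial q(z) = sum_j c_j z^j whose roots are the nodes.
    H = np.array([[a[i + j] for j in range(r + 1)] for i in range(d - r + 1)])
    _, _, Vt = np.linalg.svd(H)
    c = Vt[-1]                                   # (numerical) kernel vector
    nodes = np.roots(c[::-1])                    # roots of sum_j c_j z^j
    V = np.vander(nodes, d + 1, increasing=True).T
    weights, *_ = np.linalg.lstsq(V, a.astype(complex), rcond=None)
    return nodes, weights

# Example: f = (x + y)^3 + 2 (x - y)^3 has a = (3, -1, 3, -1); this recovers
# nodes {1, -1} and weights {1, 2}.
nodes, weights = sylvester_binary([3, -1, 3, -1], r=2)
```

The paper's algorithm replaces the univariate Hankel matrix by Quasi-Hankel (truncated moment) matrices and the root finding by eigenvector computations, so the same pattern carries over to general multihomogeneous tensors.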