We exhibit a randomized algorithm which, given a square $n \times n$ complex matrix $A$ with $\|A\| \le 1$ and $\delta > 0$, computes with high probability an invertible $V$ and diagonal $D$ such that $$\|A - VDV^{-1}\| \le \delta$$ and $\|V\|\|V^{-1}\| \le O(n^{2.5}/\delta)$, in $O(T_{MM}(n)\log^2(n/\delta))$ arithmetic operations on a floating point machine with $O(\log^4(n/\delta)\log n)$ bits of precision. Here $T_{MM}(n)$ is the number of arithmetic operations required to multiply two $n \times n$ complex matrices numerically stably, with $T_{MM}(n) = O(n^{\omega+\eta})$ for every $\eta > 0$, where $\omega$ is the exponent of matrix multiplication. The algorithm is a variant of the spectral bisection algorithm in numerical linear algebra (Beavers and Denman, 1974). This running time is optimal up to polylogarithmic factors, in the sense that verifying that a given similarity diagonalizes a matrix requires at least matrix multiplication time. It significantly improves the best previously provable running times of $O(n^{10}/\delta^2)$ arithmetic operations for diagonalization of general matrices (Armentano et al., 2018) and, with respect to the dependence on $n$, $O(n^3)$ arithmetic operations for Hermitian matrices (Parlett, 1998). The proof rests on two new ingredients. (1) We show that adding a small complex Gaussian perturbation to any matrix splits its pseudospectrum into $n$ small well-separated components. In particular, this implies that the eigenvalues of the perturbation have a large minimum gap, a property of independent interest in random matrix theory. (2) We rigorously analyze Roberts' Newton iteration method for computing the matrix sign function in finite arithmetic, itself an open problem in numerical analysis since at least 1986. This is achieved by controlling the evolution of the iterates' pseudospectra using a carefully chosen sequence of shrinking contour integrals in the complex plane.
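For readers unfamiliar with the sign-function approach, the following is a minimal NumPy sketch of Roberts' Newton iteration $X_{k+1} = (X_k + X_k^{-1})/2$ and the spectral projector it yields. The function names are ours, and this naive version makes no attempt at the finite-precision safeguards the abstract's analysis concerns; it only illustrates the exact-arithmetic iteration behind spectral bisection.

```python
import numpy as np

def matrix_sign_newton(A, tol=1e-12, max_iter=100):
    """Roberts' Newton iteration X_{k+1} = (X_k + X_k^{-1}) / 2.

    In exact arithmetic this converges quadratically to sign(A) for any
    A with no purely imaginary eigenvalues. (The paper's contribution is
    analyzing this iteration in floating point, which is NOT done here.)
    """
    X = np.asarray(A, dtype=complex)
    for _ in range(max_iter):
        X_next = 0.5 * (X + np.linalg.inv(X))
        if np.linalg.norm(X_next - X, 'fro') < tol * np.linalg.norm(X, 'fro'):
            return X_next
        X = X_next
    return X

# Spectral bisection uses P = (I + sign(A)) / 2, the projector onto the
# invariant subspace for eigenvalues with positive real part.
A = np.array([[2.0, 1.0], [0.0, -3.0]])
S = matrix_sign_newton(A)
P = 0.5 * (np.eye(2) + S)  # idempotent: P @ P == P
```

Since $A$ here is upper triangular with eigenvalues $2$ and $-3$, the limit $S$ is an involution ($S^2 = I$) with diagonal $(1, -1)$.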
It is proved that among the rational iterations locally converging with order $s > 1$ to the sign function, the Padé iterations and their reciprocals are the unique rationals with the lowest sum of the degrees of numerator and denominator.
We present a probabilistic algorithm to compute the product of two univariate sparse polynomials over a field with a number of bit operations that is quasi-linear in the size of the input and the output. Our algorithm works for any field of characteristic zero or larger than the degree. We mainly rely on sparse interpolation and on a new algorithm for verifying a sparse product that also has quasi-linear time complexity. Using Kronecker substitution techniques, we extend our result to the multivariate case.
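To make the Kronecker substitution step concrete, here is a small self-contained sketch (our own illustrative code, not the paper's algorithm): a bivariate polynomial in $x, y$ is packed into a univariate one via $y \mapsto x^D$ for $D$ exceeding the $x$-degree of the product, multiplied, and unpacked. The quadratic `mul_univariate` stands in for the quasi-linear sparse multiplication the abstract describes.

```python
def kronecker(poly, D):
    """Pack each exponent pair (i, j) into the single exponent i + D*j."""
    out = {}
    for (i, j), c in poly.items():
        e = i + D * j
        out[e] = out.get(e, 0) + c
    return out

def unkronecker(upoly, D):
    """Recover (i, j) = (e mod D, e div D) from each univariate exponent."""
    return {(e % D, e // D): c for e, c in upoly.items()}

def mul_univariate(f, g):
    # Naive placeholder for the fast sparse product in the paper.
    out = {}
    for ef, cf in f.items():
        for eg, cg in g.items():
            out[ef + eg] = out.get(ef + eg, 0) + cf * cg
    return {e: c for e, c in out.items() if c != 0}

f = {(1, 0): 1, (0, 1): 1}    # x + y
g = {(1, 0): 1, (0, 1): -1}   # x - y
D = 3                          # strictly larger than the x-degree (2) of f*g
prod = unkronecker(mul_univariate(kronecker(f, D), kronecker(g, D)), D)
# prod represents x^2 - y^2
```

The key point is that choosing $D$ larger than the $x$-degree of the product makes the packing collision-free, so exponents unpack uniquely.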
We present randomized algorithms to compute the sumset (Minkowski sum) of two integer sets, and to multiply two univariate integer polynomials given by sparse representations. Our algorithm for sumset has cost softly linear in the combined size of the inputs and output. This is used as part of our sparse multiplication algorithm, whose cost is softly linear in the combined size of the inputs, output, and the sumset of the supports of the inputs. As a subroutine, we present a new method for computing the coefficients of a sparse polynomial, given a set containing its support. Our multiplication algorithm extends to multivariate Laurent polynomials over finite fields and rational numbers. Our techniques are based on sparse interpolation algorithms and results from analytic number theory.
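As a point of reference for the sumset problem, the naive definition-level computation below costs $O(|A|\,|B|)$ operations, whereas the abstract's randomized algorithm runs in time softly linear in $|A| + |B| + |A+B|$, which can be far smaller when the sumset has many collisions. This sketch is ours, purely to fix notation.

```python
def sumset(A, B):
    """Minkowski sum A + B = {a + b : a in A, b in B} of two integer sets.

    Naive O(|A| * |B|) reference implementation; output-sensitive
    algorithms like the paper's beat this when |A + B| is small.
    """
    return {a + b for a in A for b in B}

S = sumset({0, 1, 10}, {0, 2})  # {0, 1, 2, 3, 10, 12}
```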
Matrix scaling and matrix balancing are two basic linear-algebraic problems with a wide variety of applications, such as approximating the permanent, and pre-conditioning linear systems to make them more numerically stable. We study the power and limitations of quantum algorithms for these problems. We provide quantum implementations of two classical (in both senses of the word) methods: Sinkhorn's algorithm for matrix scaling and Osborne's algorithm for matrix balancing. Using amplitude estimation as our main tool, our quantum implementations both run in time $\tilde{O}(\sqrt{mn}/\varepsilon^4)$ for scaling or balancing an $n \times n$ matrix (given by an oracle) with $m$ non-zero entries to within $\ell_1$-error $\varepsilon$. Their classical analogs use time $\tilde{O}(m/\varepsilon^2)$, and every classical algorithm for scaling or balancing with small constant $\varepsilon$ requires $\Omega(m)$ queries to the entries of the input matrix. We thus achieve a polynomial speed-up in terms of $n$, at the expense of a worse polynomial dependence on the obtained $\ell_1$-error $\varepsilon$. We emphasize that even for constant $\varepsilon$ these problems are already non-trivial (and relevant in applications). Along the way, we extend the classical analysis of Sinkhorn's and Osborne's algorithms to allow for errors in the computation of marginals. We also adapt an improved analysis of Sinkhorn's algorithm for entrywise-positive matrices to the $\ell_1$-setting, leading to an $\tilde{O}(n^{1.5}/\varepsilon^3)$-time quantum algorithm for $\varepsilon$-$\ell_1$-scaling in this case. We also prove a lower bound, showing that our quantum algorithm for matrix scaling is essentially optimal for constant $\varepsilon$: every quantum algorithm for matrix scaling that achieves a constant $\ell_1$-error with respect to uniform marginals needs to make at least $\Omega(\sqrt{mn})$ queries.
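The classical Sinkhorn iteration that the abstract quantizes is short enough to state in full. The sketch below (our own, targeting doubly stochastic marginals on an entrywise-positive matrix) alternately rescales rows and columns; the quantum algorithm replaces the exact marginal computations `A @ c` and `A.T @ r` with amplitude-estimation-based approximations, which is why the robust-to-marginal-error analysis mentioned above is needed.

```python
import numpy as np

def sinkhorn(A, iters=200):
    """Sinkhorn's algorithm: find positive diagonal scalings r, c such
    that diag(r) @ A @ diag(c) is (approximately) doubly stochastic.
    Converges for any entrywise-positive A."""
    A = np.asarray(A, dtype=float)
    r = np.ones(A.shape[0])
    c = np.ones(A.shape[1])
    for _ in range(iters):
        r = 1.0 / (A @ c)      # rescale so every row sums to 1
        c = 1.0 / (A.T @ r)    # rescale so every column sums to 1
    return np.diag(r) @ A @ np.diag(c)

S = sinkhorn([[1.0, 2.0], [3.0, 4.0]])
# Row and column sums of S are each close to 1.
```

The $\ell_1$-error in the abstract measures exactly how far the row and column sums of the scaled matrix are from the target marginals.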
We study coded distributed matrix multiplication from an approximate recovery viewpoint. We consider a system of $P$ computation nodes where each node stores $1/m$ of each multiplicand via linear encoding. Our main result shows that the matrix product can be recovered with $\epsilon$ relative error from any $m$ of the $P$ nodes for any $\epsilon > 0$. We obtain this result through a careful specialization of MatDot codes -- a class of matrix multiplication codes previously developed in the context of exact recovery ($\epsilon = 0$). Since prior results showed that MatDot codes achieve the best exact recovery threshold for a class of linear coding schemes, our result shows that allowing for mild approximations leads to a system that is nearly twice as efficient as exact reconstruction. As an additional contribution, we develop an optimization framework based on alternating minimization that enables the discovery of new codes for approximate matrix multiplication.