How many random entries of an $n \times m$, rank-$r$ matrix are necessary to reconstruct the matrix within accuracy $d$? We address this question in the case of a random matrix with bounded rank, where the observed entries are chosen uniformly at random. We prove that, for any $d > 0$, $C(r,d)\,n$ observations are sufficient. Finally, we discuss the question of reconstructing the matrix efficiently, and demonstrate through extensive simulations that this task can be accomplished in $n\,\mathrm{poly}(\log n)$ operations for small rank.
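To make the reconstruction task concrete, the following is a minimal sketch of the standard zero-fill-and-truncated-SVD baseline for completion from uniformly sampled entries. The dimensions, rank, and sampling probability are illustrative, and this is a generic spectral baseline, not the paper's own reconstruction procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical instance: a random n x m matrix of rank r.
n, m, r = 200, 150, 3
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))

# Observe each entry independently with probability p (uniform sampling).
p = 0.3
mask = rng.random((n, m)) < p

# Zero-fill the unobserved entries and rescale by 1/p, so the observed
# matrix equals M in expectation; then project onto the top-r SVD.
M_obs = np.where(mask, M, 0.0) / p
U, s, Vt = np.linalg.svd(M_obs, full_matrices=False)
M_hat = (U[:, :r] * s[:r]) @ Vt[:r]

rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
print(f"relative Frobenius error: {rel_err:.3f}")
```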
A distance matrix $A \in \mathbb{R}^{n \times m}$ represents all pairwise distances, $A_{ij} = \mathrm{d}(x_i, y_j)$, between two point sets $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$ in an arbitrary metric space $(\mathcal{Z}, \mathrm{d})$. Such matrices arise in various computational contexts, such as learning image manifolds, handwriting recognition, and multi-dimensional unfolding. In this work we study algorithms for low-rank approximation of distance matrices. Recent work by Bakshi and Woodruff (NeurIPS 2018) showed it is possible to compute a rank-$k$ approximation of a distance matrix in time $O((n+m)^{1+\gamma}) \cdot \mathrm{poly}(k, 1/\epsilon)$, where $\epsilon > 0$ is an error parameter and $\gamma > 0$ is an arbitrarily small constant. Notably, their bound is sublinear in the matrix size, which is unachievable for general matrices. We present an algorithm that is both simpler and more efficient. It reads only $O((n+m)k/\epsilon)$ entries of the input matrix, and has a running time of $O(n+m) \cdot \mathrm{poly}(k, 1/\epsilon)$. We complement the sample complexity of our algorithm with a matching lower bound on the number of entries that must be read by any algorithm. We provide experimental results to validate the approximation quality and running time of our algorithm.
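As a rough illustration of low-rank approximation from sampled rows, consider the sketch below. It is not the Bakshi-Woodruff algorithm or the paper's algorithm; in particular, the final projection reads the whole matrix, which the sublinear-time algorithms avoid. All sizes and the sample count `s` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical point sets; A[i, j] = Euclidean distance d(x_i, y_j).
n, m, k = 500, 400, 10
x = rng.standard_normal((n, 2))
y = rng.standard_normal((m, 2))
A = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)

# Uniformly sample s rows, then project A onto their top-k right singular
# subspace. Uniform sampling is plausible for distance matrices because
# the triangle inequality keeps any single row from dominating; this is
# only a heuristic sketch of that intuition.
s = 50
rows = rng.choice(n, size=s, replace=False)
R = A[rows]                       # s x m sampled row submatrix
_, _, Vt = np.linalg.svd(R, full_matrices=False)
V_k = Vt[:k].T                    # top-k right singular vectors of R
A_k = (A @ V_k) @ V_k.T           # rank-k approximation of A

err = np.linalg.norm(A - A_k) / np.linalg.norm(A)
print(f"relative Frobenius error: {err:.3f}")
```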
For an $n \times n$ matrix $M$ with entries in $\mathbb{Z}_2$, denote by $R(M)$ the minimal rank of all the matrices obtained by changing some entries on the main diagonal of $M$. We prove that for each non-negative integer $k$ there is a polynomial-in-$n$ algorithm deciding whether $R(M) \leq k$ (whose complexity may depend on $k$). We also give a polynomial-in-$n$ algorithm computing a number $m$ such that $m/2 \leq R(M) \leq m$. These results have applications to graph drawings on non-orientable surfaces.
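A brute-force illustration of the quantity $R(M)$: enumerate all $2^n$ diagonal modifications and take the minimum rank over $\mathbb{Z}_2$. This runs in exponential time and is meant only to make the definition concrete; it is not the paper's polynomial algorithm.

```python
import itertools
import numpy as np

def gf2_rank(M):
    """Rank of a 0/1 matrix over GF(2) via Gaussian elimination."""
    M = M.copy() % 2
    rank, rows, cols = 0, M.shape[0], M.shape[1]
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r, c]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]   # move pivot row up
        for r in range(rows):
            if r != rank and M[r, c]:
                M[r] ^= M[rank]               # eliminate column c elsewhere
        rank += 1
    return rank

def min_diag_rank(M):
    """R(M): minimum GF(2) rank over all 2^n diagonal modifications."""
    n = M.shape[0]
    best = n
    for diag in itertools.product([0, 1], repeat=n):
        A = M.copy()
        A[np.arange(n), np.arange(n)] = diag
        best = min(best, gf2_rank(A))
    return best

M = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]], dtype=np.uint8)
print(min_diag_rank(M))
```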
In this paper we consider the problem of computing the likelihood of the profile of a discrete distribution, i.e., the probability of observing the multiset of element frequencies, and of computing a profile maximum likelihood (PML) distribution, i.e., a distribution with maximum profile likelihood. For each problem we provide polynomial-time algorithms that, given $n$ i.i.d. samples from a discrete distribution, achieve an approximation factor of $\exp\left(-O(\sqrt{n} \log n)\right)$, improving upon the previous best bound achievable in polynomial time of $\exp(-O(n^{2/3} \log n))$ (Charikar, Shiragur and Sidford, 2019). Through the work of Acharya, Das, Orlitsky and Suresh (2016), this implies a polynomial-time universal estimator for symmetric properties of discrete distributions over a broader range of error parameters. We achieve these results by providing new bounds on the quality of approximation of the Bethe and Sinkhorn permanents (Vontobel, 2012 and 2014). We show that each of these is an $\exp(O(k \log(N/k)))$ approximation to the permanent of an $N \times N$ matrix with non-negative rank at most $k$, improving upon the previously known bound of $\exp(O(N))$. To obtain our results on PML, we exploit the fact that the PML objective is proportional to the permanent of a certain Vandermonde matrix with $\sqrt{n}$ distinct columns, i.e., with non-negative rank at most $\sqrt{n}$. As a by-product of our work we establish a surprising connection between the convex relaxation in prior work (CSS19) and the well-studied Bethe and Sinkhorn approximations.
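The Sinkhorn approximation at the heart of these bounds starts from Sinkhorn scaling: find diagonal matrices so that $B = \mathrm{diag}(x)\, A\, \mathrm{diag}(y)$ is doubly stochastic, then sandwich $\mathrm{perm}(B)$ between the Van der Waerden bound $n!/n^n$ and $1$. The sketch below shows only this generic $\exp(O(N))$-factor sandwich; the paper's contribution is the much tighter $\exp(O(k \log(N/k)))$ bound for low non-negative rank, which this sketch does not reproduce.

```python
from itertools import permutations
from math import factorial
import numpy as np

def sinkhorn_scale(A, iters=500):
    """Alternately normalize rows and columns so that
    B = diag(x) A diag(y) is (approximately) doubly stochastic."""
    n = A.shape[0]
    x, y = np.ones(n), np.ones(n)
    for _ in range(iters):
        x = 1.0 / (A @ y)      # make row sums of diag(x) A diag(y) equal 1
        y = 1.0 / (A.T @ x)    # make column sums equal 1
    return x, y

rng = np.random.default_rng(2)
n = 6
A = rng.random((n, n)) + 0.1   # strictly positive, so the scaling exists

x, y = sinkhorn_scale(A)
c = 1.0 / (np.prod(x) * np.prod(y))   # perm(A) = perm(B) * c

# For doubly stochastic B: n!/n^n <= perm(B) <= 1 (Van der Waerden bound
# below, product-of-row-sums bound above), giving a sandwich on perm(A)
# of multiplicative width n^n/n! ~ e^n.
lower = factorial(n) / n**n * c
upper = c

# Exact permanent by brute force (n is tiny) for comparison.
perm = sum(np.prod(A[range(n), p]) for p in permutations(range(n)))
print(f"{lower:.4g} <= perm(A) = {perm:.4g} <= {upper:.4g}")
```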
We consider ensembles of real symmetric band matrices with entries drawn from an infinite sequence of exchangeable random variables, as far as the symmetry of the matrices permits. In general, the entries of the upper triangular parts of these matrices are correlated, and no smallness or sparseness of these correlations is assumed. We show that the eigenvalue distribution measures still converge to a semicircle, but with random scaling. We also investigate the asymptotic behavior of the corresponding $\ell_2$-operator norms. The key to our analysis is a generalisation of a classic result by de Finetti that allows us to represent the underlying probability spaces as averages of Wigner band ensembles with entries that are not necessarily centred. Some of our results appear to be new even for such Wigner band matrices.
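A quick way to see the "semicircle with random scaling" phenomenon is to simulate the de Finetti picture directly: draw a latent parameter once, then fill the band with i.i.d. entries given that parameter. The particular mixture below (a random mean and scale) is an illustrative assumption, not the paper's general setting.

```python
import numpy as np

rng = np.random.default_rng(3)

def exchangeable_band_matrix(n, bandwidth, rng):
    """Symmetric band matrix whose band entries form an exchangeable
    sequence, realized de Finetti-style: one latent (mu, sigma) draw,
    then i.i.d. Gaussian entries given the latent."""
    mu = rng.normal()               # latent randomness shared by all entries
    sigma = rng.uniform(0.5, 1.5)   # latent scale -> random semicircle radius
    A = np.zeros((n, n))
    for i in range(n):
        j_hi = min(n, i + bandwidth + 1)
        A[i, i:j_hi] = mu + sigma * rng.standard_normal(j_hi - i)
    A = np.triu(A) + np.triu(A, 1).T          # symmetrize
    return A / np.sqrt(2 * bandwidth + 1)     # normalize by band width

# Empirical eigenvalue distribution: semicircular bulk whose radius
# varies from sample to sample through the latent (mu, sigma).
n, w = 2000, 200
eigs = np.linalg.eigvalsh(exchangeable_band_matrix(n, w, rng))
hist, edges = np.histogram(eigs, bins=40, density=True)
print(edges[0], edges[-1])   # bulk endpoints reflect the random scaling
```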
In this paper, we propose a new algorithm for the recovery of low-rank matrices from compressed linear measurements. The underlying idea is to closely approximate the rank function with a smooth function of the singular values, and then minimize the resulting approximation subject to the linear constraints. The accuracy of the approximation is controlled via a scaling parameter $\delta$, where a smaller $\delta$ corresponds to a more accurate fit. The resulting optimization problem for any finite $\delta$ is nonconvex. Therefore, to decrease the risk of ending up in a local minimum, a series of optimizations is performed, starting with a rough approximation (a large $\delta$) and successively optimizing finer approximations of the rank with smaller values of $\delta$. To solve the optimization problem for any $\delta > 0$, it is converted to a new program in which the cost is a function of two auxiliary positive semidefinite variables. We show that this new program is concave and apply a majorize-minimize technique to solve it, which in turn leads to a few convex optimization iterations. This optimization scheme is also equivalent to a reweighted Nuclear Norm Minimization (NNM), where the weighting update depends on the approximating function used. For any $\delta > 0$, we derive necessary and sufficient conditions for exact recovery that are weaker than those corresponding to NNM. On the numerical side, the proposed algorithm is compared to NNM and a reweighted NNM in solving affine rank minimization and matrix completion problems, showing its considerable and consistent superiority in terms of success rate, especially when the number of measurements decreases toward the lower bound for unique representation.
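A minimal sketch of the coarse-to-fine $\delta$-annealing idea, using the Gaussian-family surrogate $f_\delta(X) = \sum_i \left(1 - e^{-\sigma_i^2/(2\delta^2)}\right)$ on a matrix completion instance. It takes plain gradient steps on $f_\delta$ with projection onto the observed entries; the paper's actual method goes through the semidefinite reformulation and majorize-minimize iterations, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical matrix-completion instance.
n, m, r = 60, 50, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))
mask = rng.random((n, m)) < 0.4   # observed entries

def smoothed_rank_descent(M, mask, deltas, inner=50, step=1.0):
    """Graduated non-convexity sketch: f_delta(X) tends to rank(X)
    as delta -> 0; anneal delta from coarse to fine while doing
    projected gradient descent on f_delta."""
    X = np.where(mask, M, 0.0)
    for delta in deltas:
        for _ in range(inner):
            U, s, Vt = np.linalg.svd(X, full_matrices=False)
            # d/d sigma_i of f_delta: (sigma_i/delta^2) exp(-sigma_i^2/(2 delta^2))
            g = (s / delta**2) * np.exp(-s**2 / (2 * delta**2))
            # Scaling the step by delta^2 makes the update a shrinkage
            # s_i <- s_i (1 - step * exp(-s_i^2/(2 delta^2))): small
            # singular values are suppressed, large ones barely move.
            X = X - step * delta**2 * (U * g) @ Vt
            X[mask] = M[mask]     # project onto the observation constraints
    return X

deltas = [8.0 * 0.7**t for t in range(12)]   # coarse -> fine schedule
X_hat = smoothed_rank_descent(M, mask, deltas)
err = np.linalg.norm(X_hat - M) / np.linalg.norm(M)
print(f"relative error: {err:.3f}")
```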