Provably convergent acceleration in factored gradient descent with applications in matrix sensing


Abstract in English

We present theoretical results on the convergence of emph{non-convex} accelerated gradient descent in matrix factorization models with $ell_2$-norm loss. The purpose of this work is to study the effects of acceleration in non-convex settings, where provable convergence with acceleration should not be considered a emph{de facto} property. The technique is applied to matrix sensing problems, for the estimation of a rank $r$ optimal solution $X^star in mathbb{R}^{n times n}$. Our contributions can be summarized as follows. $i)$ We show that acceleration in factored gradient descent converges at a linear rate; this fact is novel for non-convex matrix factorization settings, under common assumptions. $ii)$ Our proof technique requires the acceleration parameter to be carefully selected, based on the properties of the problem, such as the condition number of $X^star$ and the condition number of objective function. $iii)$ Currently, our proof leads to the same dependence on the condition number(s) in the contraction parameter, similar to recent results on non-accelerated algorithms. $iv)$ Acceleration is observed in practice, both in synthetic examples and in two real applications: neuronal multi-unit activities recovery from single electrode recordings, and quantum state tomography on quantum computing simulators.

Download