Multi-reference factor analysis: low-rank covariance estimation under unknown translations


Abstract in English

We consider the problem of estimating the covariance matrix of a random signal observed through unknown translations (modeled by cyclic shifts) and corrupted by noise. Solving this problem allows to discover low-rank structures masked by the existence of translations (which act as nuisance parameters), with direct application to Principal Components Analysis (PCA). We assume that the underlying signal is of length $L$ and follows a standard factor model with mean zero and $r$ normally-distributed factors. To recover the covariance matrix in this case, we propose to employ the second- and fourth-order shift-invariant moments of the signal known as the $textit{power spectrum}$ and the $textit{trispectrum}$. We prove that they are sufficient for recovering the covariance matrix (under a certain technical condition) when $r<sqrt{L}$. Correspondingly, we provide a polynomial-time procedure for estimating the covariance matrix from many (translated and noisy) observations, where no explicit knowledge of $r$ is required, and prove the procedures statistical consistency. While our results establish that covariance estimation is possible from the power spectrum and the trispectrum for low-rank covariance matrices, we prove that this is not the case for full-rank covariance matrices. We conduct numerical experiments that corroborate our theoretical findings, and demonstrate the favorable performance of our algorithms in various settings, including in high levels of noise.

Download