
Multi-reference factor analysis: low-rank covariance estimation under unknown translations

Published by Boris Landa
Publication date: 2019
Research field: Informatics Engineering
Research language: English





We consider the problem of estimating the covariance matrix of a random signal observed through unknown translations (modeled by cyclic shifts) and corrupted by noise. Solving this problem makes it possible to discover low-rank structures masked by the translations (which act as nuisance parameters), with direct application to Principal Components Analysis (PCA). We assume that the underlying signal is of length $L$ and follows a standard factor model with mean zero and $r$ normally distributed factors. To recover the covariance matrix in this case, we propose to employ the second- and fourth-order shift-invariant moments of the signal, known as the \textit{power spectrum} and the \textit{trispectrum}. We prove that they are sufficient for recovering the covariance matrix (under a certain technical condition) when $r<\sqrt{L}$. Correspondingly, we provide a polynomial-time procedure for estimating the covariance matrix from many (translated and noisy) observations, where no explicit knowledge of $r$ is required, and prove the procedure's statistical consistency. While our results establish that covariance estimation is possible from the power spectrum and the trispectrum for low-rank covariance matrices, we prove that this is not the case for full-rank covariance matrices. We conduct numerical experiments that corroborate our theoretical findings and demonstrate the favorable performance of our algorithms in various settings, including at high noise levels.
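To make the notion of shift-invariant moments concrete, here is a minimal NumPy sketch that draws one signal from a rank-$r$ factor model and checks that its power spectrum and trispectrum do not change under a cyclic shift. The trispectrum indexing used here, $T[k_1,k_2,k_3] = X[k_1]\,\overline{X[k_2]}\,X[k_3]\,\overline{X[k_1-k_2+k_3]}$, is one common convention and is an assumption; the paper's exact definition may differ. The loading matrix `A`, the sizes `L` and `r`, and the function names are all hypothetical, and the sketch does not implement the paper's covariance estimator.

```python
import numpy as np

def power_spectrum(x):
    """Second-order shift-invariant moment: |X[k]|^2 for the DFT X of x."""
    X = np.fft.fft(x)
    return np.abs(X) ** 2

def trispectrum(x):
    """Fourth-order shift-invariant moment (assumed convention):
    T[k1, k2, k3] = X[k1] * conj(X[k2]) * X[k3] * conj(X[k1 - k2 + k3])."""
    L = len(x)
    X = np.fft.fft(x)
    k = np.arange(L)
    k1, k2, k3 = np.meshgrid(k, k, k, indexing="ij")
    k4 = (k1 - k2 + k3) % L  # index arithmetic is modulo L
    return X[k1] * np.conj(X[k2]) * X[k3] * np.conj(X[k4])

rng = np.random.default_rng(0)
L, r = 16, 3                       # hypothetical sizes with r < sqrt(L) = 4
A = rng.standard_normal((L, r))    # hypothetical factor loadings
x = A @ rng.standard_normal(r)     # one zero-mean draw from the factor model
shifted = np.roll(x, 5)            # a translation, modeled as a cyclic shift

# Both moments should agree for the signal and its shifted copy.
ps_invariant = np.allclose(power_spectrum(x), power_spectrum(shifted))
ts_invariant = np.allclose(trispectrum(x), trispectrum(shifted))
print(ps_invariant, ts_invariant)
```

A cyclic shift by $s$ multiplies $X[k]$ by the phase $e^{-2\pi i k s/L}$. In both moments the phases cancel because the frequency indices, taken with their conjugation signs, sum to zero modulo $L$.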




Read also

In recent years, there has been a growing need for processing methods aimed at extracting useful information from large datasets. In many cases the challenge is to discover a low-dimensional structure in the data, often concealed by the existence of nuisance parameters and noise. Motivated by such challenges, we consider the problem of estimating a signal from its scaled, cyclically shifted and noisy observations. We focus on the particularly challenging regime of low signal-to-noise ratio (SNR), where different observations cannot be shift-aligned. We show that an accurate estimation of the signal from its noisy observations is possible, and derive a procedure which is proved to consistently estimate the signal. The asymptotic sample complexity (the number of observations required to recover the signal) of the procedure is $1/\operatorname{SNR}^4$. Additionally, we propose a procedure which is experimentally shown to improve the sample complexity by a factor equal to the signal's length. Finally, we present numerical experiments which demonstrate the performance of our algorithms and corroborate our theoretical findings.
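As a hedged illustration of why shift alignment can be avoided in this setting, the sketch below estimates a shift-invariant statistic (the power spectrum) directly from noisy, cyclically shifted observations. It then subtracts the known bias $L\sigma^2$ that white noise adds to each bin of the unnormalized DFT. This is a standard device, not the procedure from the abstract, which does not spell out its steps. The sketch also ignores the scaling mentioned there and assumes the noise level `sigma` is known. All sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
L, N, sigma = 8, 50000, 1.0          # hypothetical length, sample count, noise level
x = rng.standard_normal(L)           # the unknown signal (kept only to check the estimate)

# Each observation is a random cyclic shift of x plus white Gaussian noise.
shifts = rng.integers(0, L, size=N)
idx = (np.arange(L)[None, :] - shifts[:, None]) % L   # row i indexes x rolled by shifts[i]
Y = x[idx] + sigma * rng.standard_normal((N, L))

# Average the per-observation power spectra, then remove the noise bias L * sigma^2.
ps_est = np.mean(np.abs(np.fft.fft(Y, axis=1)) ** 2, axis=0) - L * sigma**2
ps_true = np.abs(np.fft.fft(x)) ** 2
max_err = np.max(np.abs(ps_est - ps_true))
print(max_err)
```

No observation is ever aligned to another. The shifts drop out of each per-observation power spectrum, so only the noise has to be averaged away.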
We consider the problem of estimating a low-rank covariance function $K(t,u)$ of a Gaussian process $S(t)$, $t \in [0,1]$, based on $n$ i.i.d. copies of $S$ observed in white noise. We suggest a new estimation procedure that adapts simultaneously to the low-rank structure and the smoothness of the covariance function. The new procedure is based on nuclear norm penalization and exhibits superior performance compared to the sample covariance function by a polynomial factor in the sample size $n$. Other results include a minimax lower bound for estimation of low-rank covariance functions, showing that our procedure is optimal, as well as a scheme to estimate the unknown noise variance of the Gaussian process.
Motivated by geometric problems in signal processing, computer vision, and structural biology, we study a class of orbit recovery problems where we observe very noisy copies of an unknown signal, each acted upon by a random element of some group (such as Z/p or SO(3)). The goal is to recover the orbit of the signal under the group action in the high-noise regime. This generalizes problems of interest such as multi-reference alignment (MRA) and the reconstruction problem in cryo-electron microscopy (cryo-EM). We obtain matching lower and upper bounds on the sample complexity of these problems in high generality, showing that the statistical difficulty is intricately determined by the invariant theory of the underlying symmetry group. In particular, we determine that for cryo-EM with noise variance $\sigma^2$ and uniform viewing directions, the number of samples required scales as $\sigma^6$. We match this bound with a novel algorithm for ab initio reconstruction in cryo-EM, based on invariant features of degree at most 3. We further discuss how to recover multiple molecular structures from heterogeneous cryo-EM samples.
We study a panel data model with general heterogeneous effects where slopes are allowed to vary both across individuals and over time. The key dimension reduction assumption we employ is that the heterogeneous slopes can be expressed as having a factor structure, so that the high-dimensional slope matrix is low-rank and can thus be estimated using low-rank regularized regression. We provide a simple multi-step estimation procedure for the heterogeneous effects. The procedure makes use of sample-splitting and orthogonalization to accommodate inference following the use of penalized low-rank estimation. We formally verify that the resulting estimator is asymptotically normal, allowing simple construction of inferential statements for the individual-time-specific effects and for cross-sectional averages of these effects. We illustrate the proposed method in simulation experiments and by estimating the effect of the minimum wage on employment.
We revisit the problem of estimating the mean of a real-valued distribution, presenting a novel estimator with sub-Gaussian convergence: intuitively, our estimator, on any distribution, is as accurate as the sample mean is for the Gaussian distribution of matching variance. Crucially, in contrast to prior works, our estimator does not require prior knowledge of the variance, and works across the entire gamut of distributions with bounded variance, including those without any higher moments. Parameterized by the sample size $n$, the failure probability $\delta$, and the variance $\sigma^2$, our estimator is accurate to within $\sigma\cdot(1+o(1))\sqrt{\frac{2\log\frac{1}{\delta}}{n}}$, tight up to the $1+o(1)$ factor. Our estimator construction and analysis give a framework generalizable to other problems, tightly analyzing a sum of dependent random variables by viewing the sum implicitly as a 2-parameter $\psi$-estimator, and constructing bounds using mathematical programming and duality techniques.