No Arabic abstract
In this paper, we propose a general framework for tensor singular value decomposition (tensor SVD), which focuses on the methodology and theory for extracting the hidden low-rank structure from high-dimensional tensor data. Comprehensive results are developed on both the statistical and computational limits for tensor SVD. This problem exhibits three different phases according to the signal-to-noise ratio (SNR). In particular, with strong SNR, we show that the classical higher-order orthogonal iteration achieves the minimax optimal rate of convergence in estimation; with weak SNR, the information-theoretical lower bound implies that it is impossible to have consistent estimation in general; with moderate SNR, we show that the non-convex maximum likelihood estimation provides optimal solution, but with NP-hard computational cost; moreover, under the hardness hypothesis of hypergraphic planted clique detection, there are no polynomial-time algorithms performing consistently in general.
We consider the phase retrieval problem of reconstructing a $n$-dimensional real or complex signal $mathbf{X}^{star}$ from $m$ (possibly noisy) observations $Y_mu = | sum_{i=1}^n Phi_{mu i} X^{star}_i/sqrt{n}|$, for a large class of correlated real and complex random sensing matrices $mathbf{Phi}$, in a high-dimensional setting where $m,ntoinfty$ while $alpha = m/n=Theta(1)$. First, we derive sharp asymptotics for the lowest possible estimation error achievable statistically and we unveil the existence of sharp phase transitions for the weak- and full-recovery thresholds as a function of the singular values of the matrix $mathbf{Phi}$. This is achieved by providing a rigorous proof of a result first obtained by the replica method from statistical mechanics. In particular, the information-theoretic transition to perfect recovery for full-rank matrices appears at $alpha=1$ (real case) and $alpha=2$ (complex case). Secondly, we analyze the performance of the best-known polynomial time algorithm for this problem -- approximate message-passing -- establishing the existence of a statistical-to-algorithmic gap depending, again, on the spectral properties of $mathbf{Phi}$. Our work provides an extensive classification of the statistical and algorithmic thresholds in high-dimensional phase retrieval for a broad class of random matrices.
We study the problem of distributional approximations to high-dimensional non-degenerate $U$-statistics with random kernels of diverging orders. Infinite-order $U$-statistics (IOUS) are a useful tool for constructing simultaneous prediction intervals that quantify the uncertainty of ensemble methods such as subbagging and random forests. A major obstacle in using the IOUS is their computational intractability when the sample size and/or order are large. In this article, we derive non-asymptotic Gaussian approximation error bounds for an incomplete version of the IOUS with a random kernel. We also study data-driven inferential methods for the incomplete IOUS via bootstraps and develop their statistical and computational guarantees.
In this work we study the fundamental limits of approximate recovery in the context of group testing. One of the most well-known, theoretically optimal, and easy to implement testing procedures is the non-adaptive Bernoulli group testing problem, where all tests are conducted in parallel, and each item is chosen to be part of any certain test independently with some fixed probability. In this setting, there is an observed gap between the number of tests above which recovery is information theoretically (IT) possible, and the number of tests required by the currently best known efficient algorithms to succeed. Often times such gaps are explained by a phase transition in the landscape of the solution space of the problem (an Overlap Gap Property phase transition). In this paper we seek to understand whether such a phenomenon takes place for Bernoulli group testing as well. Our main contributions are the following: (1) We provide first moment evidence that, perhaps surprisingly, such a phase transition does not take place throughout the regime for which recovery is IT possible. This fact suggests that the model is in fact amenable to local search algorithms ; (2) we prove the complete absence of bad local minima for a part of the hard regime, a fact which implies an improvement over known theoretical results on the performance of efficient algorithms for approximate recovery without false-negatives, and finally (3) we present extensive simulations that strongly suggest that a very simple local algorithm known as Glauber Dynamics does indeed succeed, and can be used to efficiently implement the well-known (theoretically optimal) Smallest Satisfying Set (SSS) estimator.
We propose a novel probabilistic method for detection of objects in noisy images. The method uses results from percolation and random graph theories. We present an algorithm that allows to detect objects of unknown shapes in the presence of random noise. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. We prove results on consistency and algorithmic complexity of our procedure.
We propose a new fast streaming algorithm for the tensor completion problem of imputing missing entries of a low-tubal-rank tensor using the tensor singular value decomposition (t-SVD) algebraic framework. We show the t-SVD is a specialization of the well-studied block-term decomposition for third-order tensors, and we present an algorithm under this model that can track changing free submodules from incomplete streaming 2-D data. We use principles from incremental gradient descent on the Grassmann manifold of subspaces to solve the tensor completion problem with linear complexity and constant memory in the number of time samples. Our results are competitive in accuracy but much faster in compute time than state-of-the-art tensor completion algorithms on real applications to recover temporal chemo-sensing and MRI data under limited sampling.