Do you want to publish a course? Click here

The All-or-Nothing Phenomenon in Sparse Tensor PCA

171   0   0.0 ( 0 )
 Added by Ilias Zadik
 Publication date 2020
and research's language is English




Ask ChatGPT about the research

We study the statistical problem of estimating a rank-one sparse tensor corrupted by additive Gaussian noise, a model also known as sparse tensor PCA. We show that for Bernoulli and Bernoulli-Rademacher distributed signals and emph{for all} sparsity levels which are sublinear in the dimension of the signal, the sparse tensor PCA model exhibits a phase transition called the emph{all-or-nothing phenomenon}. This is the property that for some signal-to-noise ratio (SNR) $mathrm{SNR_c}$ and any fixed $epsilon>0$, if the SNR of the model is below $left(1-epsilonright)mathrm{SNR_c}$, then it is impossible to achieve any arbitrarily small constant correlation with the hidden signal, while if the SNR is above $left(1+epsilon right)mathrm{SNR_c}$, then it is possible to achieve almost perfect correlation with the hidden signal. The all-or-nothing phenomenon was initially established in the context of sparse linear regression, and over the last year also in the context of sparse 2-tensor (matrix) PCA, Bernoulli group testing, and generalized linear models. Our results follow from a more general result showing that for any Gaussian additive model with a discrete uniform prior, the all-or-nothing phenomenon follows as a direct outcome of an appropriately defined near-orthogonality property of the support of the prior distribution.



rate research

Read More

We consider the linear regression problem of estimating a $p$-dimensional vector $beta$ from $n$ observations $Y = X beta + W$, where $beta_j stackrel{text{i.i.d.}}{sim} pi$ for a real-valued distribution $pi$ with zero mean and unit variance, $X_{ij} stackrel{text{i.i.d.}}{sim} mathcal{N}(0,1)$, and $W_istackrel{text{i.i.d.}}{sim} mathcal{N}(0, sigma^2)$. In the asymptotic regime where $n/p to delta$ and $ p/ sigma^2 to mathsf{snr}$ for two fixed constants $delta, mathsf{snr}in (0, infty)$ as $p to infty$, the limiting (normalized) minimum mean-squared error (MMSE) has been characterized by the MMSE of an associated single-letter (additive Gaussian scalar) channel. In this paper, we show that if the MMSE function of the single-letter channel converges to a step function, then the limiting MMSE of estimating $beta$ in the linear regression problem converges to a step function which jumps from $1$ to $0$ at a critical threshold. Moreover, we establish that the limiting mean-squared error of the (MSE-optimal) approximate message passing algorithm also converges to a step function with a larger threshold, providing evidence for the presence of a computational-statistical gap between the two thresholds.
We establish a phase transition known as the all-or-nothing phenomenon for noiseless discrete channels. This class of models includes the Bernoulli group testing model and the planted Gaussian perceptron model. Previously, the existence of the all-or-nothing phenomenon for such models was only known in a limited range of parameters. Our work extends the results to all signals with arbitrary sublinear sparsity. Over the past several years, the all-or-nothing phenomenon has been established in various models as an outcome of two seemingly disjoint results: one positive result establishing the all half of all-or-nothing, and one impossibility result establishing the nothing half. Our main technique in the present work is to show that for noiseless discrete channels, the all half implies the nothing half, that is a proof of all can be turned into a proof of nothing. Since the all half can often be proven by straightforward means -- for instance, by the first-moment method -- our equivalence gives a powerful and general approach towards establishing the existence of this phenomenon in other contexts.
In sparse principal component analysis we are given noisy observations of a low-rank matrix of dimension $ntimes p$ and seek to reconstruct it under additional sparsity assumptions. In particular, we assume here each of the principal components $mathbf{v}_1,dots,mathbf{v}_r$ has at most $s_0$ non-zero entries. We are particularly interested in the high dimensional regime wherein $p$ is comparable to, or even much larger than $n$. In an influential paper, cite{johnstone2004sparse} introduced a simple algorithm that estimates the support of the principal vectors $mathbf{v}_1,dots,mathbf{v}_r$ by the largest entries in the diagonal of the empirical covariance. This method can be shown to identify the correct support with high probability if $s_0le K_1sqrt{n/log p}$, and to fail with high probability if $s_0ge K_2 sqrt{n/log p}$ for two constants $0<K_1,K_2<infty$. Despite a considerable amount of work over the last ten years, no practical algorithm exists with provably better support recovery guarantees. Here we analyze a covariance thresholding algorithm that was recently proposed by cite{KrauthgamerSPCA}. On the basis of numerical simulations (for the rank-one case), these authors conjectured that covariance thresholding correctly recover the support with high probability for $s_0le Ksqrt{n}$ (assuming $n$ of the same order as $p$). We prove this conjecture, and in fact establish a more general guarantee including higher-rank as well as $n$ much smaller than $p$. Recent lower bounds cite{berthet2013computational, ma2015sum} suggest that no polynomial time algorithm can do significantly better. The key technical component of our analysis develops new bounds on the norm of kernel random matrices, in regimes that were not considered before.
In this paper, we study the power iteration algorithm for the spiked tensor model, as introduced in [44]. We give necessary and sufficient conditions for the convergence of the power iteration algorithm. When the power iteration algorithm converges, for the rank one spiked tensor model, we show the estimators for the spike strength and linear functionals of the signal are asymptotically Gaussian; for the multi-rank spiked tensor model, we show the estimators are asymptotically mixtures of Gaussian. This new phenomenon is different from the spiked matrix model. Using these asymptotic results of our estimators, we construct valid and efficient confidence intervals for spike strengths and linear functionals of the signals.
This paper studies the problem of high-dimensional multiple testing and sparse recovery from the perspective of sequential analysis. In this setting, the probability of error is a function of the dimension of the problem. A simple sequential testing procedure is proposed. We derive necessary conditions for reliable recovery in the non-sequential setting and contrast them with sufficient conditions for reliable recovery using the proposed sequential testing procedure. Applications of the main results to several commonly encountered models show that sequential testing can be exponentially more sensitive to the difference between the null and alternative distributions (in terms of the dependence on dimension), implying that subtle cases can be much more reliably determined using sequential methods.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا