Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

The All-or-Nothing Phenomenon in Sparse Tensor PCA

171 0 0.0 ( 0 )

Download Cite

Added by Ilias Zadik

Publication date 2020

fields Informatics Engineering

and research's language is English

Authors Jonathan Niles-Weed - Ilias Zadik

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

We study the statistical problem of estimating a rank-one sparse tensor corrupted by additive Gaussian noise, a model also known as sparse tensor PCA. We show that for Bernoulli and Bernoulli-Rademacher distributed signals and emph{for all} sparsity levels which are sublinear in the dimension of the signal, the sparse tensor PCA model exhibits a phase transition called the emph{all-or-nothing phenomenon}. This is the property that for some signal-to-noise ratio (SNR) $mathrm{SNR_c}$ and any fixed $epsilon>0$, if the SNR of the model is below $left(1-epsilonright)mathrm{SNR_c}$, then it is impossible to achieve any arbitrarily small constant correlation with the hidden signal, while if the SNR is above $left(1+epsilon right)mathrm{SNR_c}$, then it is possible to achieve almost perfect correlation with the hidden signal. The all-or-nothing phenomenon was initially established in the context of sparse linear regression, and over the last year also in the context of sparse 2-tensor (matrix) PCA, Bernoulli group testing, and generalized linear models. Our results follow from a more general result showing that for any Gaussian additive model with a discrete uniform prior, the all-or-nothing phenomenon follows as a direct outcome of an appropriately defined near-orthogonality property of the support of the prior distribution.

rate research

All-or-Nothing Phenomena: From Single-Letter to High Dimensions

73 - Galen Reeves , Jiaming Xu , Ilias Zadik 2019

We consider the linear regression problem of estimating a $p$-dimensional vector $beta$ from $n$ observations $Y = X beta + W$, where $beta_j stackrel{text{i.i.d.}}{sim} pi$ for a real-valued distribution $pi$ with zero mean and unit variance, $X_{ij} stackrel{text{i.i.d.}}{sim} mathcal{N}(0,1)$, and $W_istackrel{text{i.i.d.}}{sim} mathcal{N}(0, sigma^2)$. In the asymptotic regime where $n/p to delta$ and $ p/ sigma^2 to mathsf{snr}$ for two fixed constants $delta, mathsf{snr}in (0, infty)$ as $p to infty$, the limiting (normalized) minimum mean-squared error (MMSE) has been characterized by the MMSE of an associated single-letter (additive Gaussian scalar) channel. In this paper, we show that if the MMSE function of the single-letter channel converges to a step function, then the limiting MMSE of estimating $beta$ in the linear regression problem converges to a step function which jumps from $1$ to $0$ at a critical threshold. Moreover, we establish that the limiting mean-squared error of the (MSE-optimal) approximate message passing algorithm also converges to a step function with a larger threshold, providing evidence for the presence of a computational-statistical gap between the two thresholds.

Statistics Theory Information Theory Information Theory

It was all for nothing: sharp phase transitions for noiseless discrete channels

77 - Jonathan Niles-Weed , Ilias Zadik 2021

We establish a phase transition known as the all-or-nothing phenomenon for noiseless discrete channels. This class of models includes the Bernoulli group testing model and the planted Gaussian perceptron model. Previously, the existence of the all-or-nothing phenomenon for such models was only known in a limited range of parameters. Our work extends the results to all signals with arbitrary sublinear sparsity. Over the past several years, the all-or-nothing phenomenon has been established in various models as an outcome of two seemingly disjoint results: one positive result establishing the all half of all-or-nothing, and one impossibility result establishing the nothing half. Our main technique in the present work is to show that for noiseless discrete channels, the all half implies the nothing half, that is a proof of all can be turned into a proof of nothing. Since the all half can often be proven by straightforward means -- for instance, by the first-moment method -- our equivalence gives a powerful and general approach towards establishing the existence of this phenomenon in other contexts.

Statistics Theory Information Theory Information Theory

Sparse PCA via Covariance Thresholding

519 - Yash Deshpande , Andrea Montanari 2013

In sparse principal component analysis we are given noisy observations of a low-rank matrix of dimension $ntimes p$ and seek to reconstruct it under additional sparsity assumptions. In particular, we assume here each of the principal components $mathbf{v}_1,dots,mathbf{v}_r$ has at most $s_0$ non-zero entries. We are particularly interested in the high dimensional regime wherein $p$ is comparable to, or even much larger than $n$. In an influential paper, cite{johnstone2004sparse} introduced a simple algorithm that estimates the support of the principal vectors $mathbf{v}_1,dots,mathbf{v}_r$ by the largest entries in the diagonal of the empirical covariance. This method can be shown to identify the correct support with high probability if $s_0le K_1sqrt{n/log p}$, and to fail with high probability if $s_0ge K_2 sqrt{n/log p}$ for two constants $0<K_1,K_2<infty$. Despite a considerable amount of work over the last ten years, no practical algorithm exists with provably better support recovery guarantees. Here we analyze a covariance thresholding algorithm that was recently proposed by cite{KrauthgamerSPCA}. On the basis of numerical simulations (for the rank-one case), these authors conjectured that covariance thresholding correctly recover the support with high probability for $s_0le Ksqrt{n}$ (assuming $n$ of the same order as $p$). We prove this conjecture, and in fact establish a more general guarantee including higher-rank as well as $n$ much smaller than $p$. Recent lower bounds cite{berthet2013computational, ma2015sum} suggest that no polynomial time algorithm can do significantly better. The key technical component of our analysis develops new bounds on the norm of kernel random matrices, in regimes that were not considered before.

Statistics Theory Machine Learning Statistics Theory

Power Iteration for Tensor PCA

88 - Jiaoyang Huang , Daniel Z. Huang , Qing Yang 2020

In this paper, we study the power iteration algorithm for the spiked tensor model, as introduced in [44]. We give necessary and sufficient conditions for the convergence of the power iteration algorithm. When the power iteration algorithm converges, for the rank one spiked tensor model, we show the estimators for the spike strength and linear functionals of the signal are asymptotically Gaussian; for the multi-rank spiked tensor model, we show the estimators are asymptotically mixtures of Gaussian. This new phenomenon is different from the spiked matrix model. Using these asymptotic results of our estimators, we construct valid and efficient confidence intervals for spike strengths and linear functionals of the signals.

Statistics Theory Probability Statistics Theory

Sequential Analysis in High Dimensional Multiple Testing and Sparse Recovery

430 - Matthew Malloy , Robert Nowak 2011

This paper studies the problem of high-dimensional multiple testing and sparse recovery from the perspective of sequential analysis. In this setting, the probability of error is a function of the dimension of the problem. A simple sequential testing procedure is proposed. We derive necessary conditions for reliable recovery in the non-sequential setting and contrast them with sufficient conditions for reliable recovery using the proposed sequential testing procedure. Applications of the main results to several commonly encountered models show that sequential testing can be exponentially more sensitive to the difference between the null and alternative distributions (in terms of the dependence on dimension), implying that subtle cases can be much more reliably determined using sequential methods.

Statistics Theory Information Theory Information Theory

comments

Fetching comments

Institut National d'Administration

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

The All-or-Nothing Phenomenon in Sparse Tensor PCA

Ask ChatGPT about the research

No Arabic abstract

Read More