Do you want to publish a course? Click here

Information-theoretically Optimal Sparse PCA

453   0   0.0 ( 0 )
 Added by Yash Deshpande
 Publication date 2014
and research's language is English




Ask ChatGPT about the research

Sparse Principal Component Analysis (PCA) is a dimensionality reduction technique wherein one seeks a low-rank representation of a data matrix with additional sparsity constraints on the obtained representation. We consider two probabilistic formulations of sparse PCA: a spiked Wigner and spiked Wishart (or spiked covariance) model. We analyze an Approximate Message Passing (AMP) algorithm to estimate the underlying signal and show, in the high dimensional limit, that the AMP estimates are information-theoretically optimal. As an immediate corollary, our results demonstrate that the posterior expectation of the underlying signal, which is often intractable to compute, can be obtained using a polynomial-time scheme. Our results also effectively provide a single-letter characterization of the sparse PCA problem.



rate research

Read More

We study optimal estimation for sparse principal component analysis when the number of non-zero elements is small but on the same order as the dimension of the data. We employ approximate message passing (AMP) algorithm and its state evolution to analyze what is the information theoretically minimal mean-squared error and the one achieved by AMP in the limit of large sizes. For a special case of rank one and large enough density of non-zeros Deshpande and Montanari [1] proved that AMP is asymptotically optimal. We show that both for low density and for large rank the problem undergoes a series of phase transitions suggesting existence of a region of parameters where estimation is information theoretically possible, but AMP (and presumably every other polynomial algorithm) fails. The analysis of the large rank limit is particularly instructive.
We study the problem of recovering a block-sparse signal from under-sampled observations. The non-zero values of such signals appear in few blocks, and their recovery is often accomplished using a $ell_{1,2}$ optimization problem. In applications such as DNA micro-arrays, some prior information about the block support, i.e., blocks containing non-zero elements, is available. A typical way to consider the extra information in recovery procedures is to solve a weighted $ell_{1,2}$ problem. In this paper, we consider a block sparse model, where the block support has intersection with some given subsets of blocks with known probabilities. Our goal in this work is to minimize the number of required linear Gaussian measurements for perfect recovery of the signal by tuning the weights of a weighted $ell_{1,2}$ problem. For this goal, we apply tools from conic integral geometry and derive closed-form expressions for the optimal weights. We show through precise analysis and simulations that the weighted $ell_{1,2}$ problem with optimal weights significantly outperforms the regular $ell_{1,2}$ problem. We further examine the sensitivity of the optimal weights to the mismatch of block probabilities, and conclude stability under small probability deviations.
In this paper study the problem of signal detection in Gaussian noise in a distributed setting. We derive a lower bound on the size that the signal needs to have in order to be detectable. Moreover, we exhibit optimal distributed testing strategies that attain the lower bound.
71 - Lin Zhou , Yun Wei , Alfred Hero 2020
We revisit the universal outlier hypothesis testing (Li emph{et al.}, TIT 2014) and derive fundamental limits for the optimal test. In outlying hypothesis testing, one is given multiple observed sequences, where most sequences are generated i.i.d. from a nominal distribution. The task is to discern the set of outlying sequences that are generated according to anomalous distributions. The nominal and anomalous distributions are emph{unknown}. We study the tradeoff among the probabilities of misclassification error, false alarm and false reject for tests that satisfy weak conditions on the rate of decrease of these error probabilities as a function of sequence length. Specifically, we propose a threshold-based universal test that ensures exponential decay of misclassification error and false alarm probabilities. We study two constraints on the false reject probabilities, one is that it be a non-vanishing constant and the other is that it have an exponential decay rate. For both cases, we characterize bounds on the false reject probability, as a function of the threshold, for each pair of nominal and anomalous distributions and demonstrate the optimality of our test in the generalized Neyman-Pearson sense. We first consider the case of at most one outlier and then generalize our results to the case of multiple outliers where the number of outliers is unknown and each outlier can follow a different anomalous distribution.
This work studies the recursive robust principal components analysis (PCA) problem. Here, robust refers to robustness to both independent and correlated sparse outliers, although we focus on the latter. A key application where this problem occurs is in video surveillance where the goal is to separate a slowly changing background from moving foreground objects on-the-fly. The background sequence is well modeled as lying in a low dimensional subspace, that can gradually change over time, while the moving foreground objects constitute the correlated sparse outliers. In this and many other applications, the foreground is an outlier for PCA but is actually the signal of interest for the application; where as the background is the corruption or noise. Thus our problem can also be interpreted as one of recursively recovering a time sequence of sparse signals in the presence of large but spatially correlated noise. This work has two key contributions. First, we provide a new way of looking at this problem and show how a key part of our solution strategy involves solving a noisy compressive sensing (CS) problem. Second, we show how we can utilize the correlation of the outliers to our advantage in order to even deal with very large support sized outliers. The main idea is as follows. The correlation model applied to the previous support estimate helps predict the current support. This prediction serves as partial support knowledge for solving the modified-CS problem instead of CS. The support estimate of the modified-CS reconstruction is, in turn, used to update the correlation model parameters using a Kalman filter (or any adaptive filter). We call the resulting approach support-predicted modified-CS.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا