ﻻ يوجد ملخص باللغة العربية
We consider the problem of recovering a $k$-sparse signal ${mbox{$beta$}}_0inmathbb{R}^p$ from noisy observations $bf y={bf X}mbox{$beta$}_0+{bf w}inmathbb{R}^n$. One of the most popular approaches is the $l_1$-regularized least squares, also known as LASSO. We analyze the mean square error of LASSO in the case of random designs in which each row of ${bf X}$ is drawn from distribution $N(0,{mbox{$Sigma$}})$ with general ${mbox{$Sigma$}}$. We first derive the asymptotic risk of LASSO in the limit of $n,prightarrowinfty$ with $n/prightarrowdelta$. We then examine conditions on $n$, $p$, and $k$ for LASSO to exactly reconstruct ${mbox{$beta$}}_0$ in the noiseless case ${bf w}=0$. A phase boundary $delta_c=delta(epsilon)$ is precisely established in the phase space defined by $0ledelta,epsilonle 1$, where $epsilon=k/p$. Above this boundary, LASSO perfectly recovers ${mbox{$beta$}}_0$ with high probability. Below this boundary, LASSO fails to recover $mbox{$beta$}_0$ with high probability. While the values of the non-zero elements of ${mbox{$beta$}}_0$ do not have any effect on the phase transition curve, our analysis shows that $delta_c$ does depend on the signed pattern of the nonzero values of $mbox{$beta$}_0$ for general ${mbox{$Sigma$}} e{bf I}_p$. This is in sharp contrast to the previous phase transition results derived in i.i.d. case with $mbox{$Sigma$}={bf I}_p$ where $delta_c$ is completely determined by $epsilon$ regardless of the distribution of $mbox{$beta$}_0$. Underlying our formalism is a recently developed efficient algorithm called approximate message passing (AMP) algorithm. We generalize the state evolution of AMP from i.i.d. case to general case with ${mbox{$Sigma$}} e{bf I}_p$. Extensive computational experiments confirm that our theoretical predictions are consistent with simulation results on moderate size system.
Inferring causal relationships or related associations from observational data can be invalidated by the existence of hidden confounding. We focus on a high-dimensional linear regression setting, where the measured covariates are affected by hidden c
We develop a new class of distribution--free multiple testing rules for false discovery rate (FDR) control under general dependence. A key element in our proposal is a symmetrized data aggregation (SDA) approach to incorporating the dependence struct
Let $G$ be a non--linear function of a Gaussian process ${X_t}_{tinmathbb{Z}}$ with long--range dependence. The resulting process ${G(X_t)}_{tinmathbb{Z}}$ is not Gaussian when $G$ is not linear. We consider random wavelet coefficients associated wit
Least Absolute Shrinkage and Selection Operator or the Lasso, introduced by Tibshirani (1996), is a popular estimation procedure in multiple linear regression when underlying design has a sparse structure, because of its property that it sets some re
The lasso and related sparsity inducing algorithms have been the target of substantial theoretical and applied research. Correspondingly, many results are known about their behavior for a fixed or optimally chosen tuning parameter specified up to unk