We consider two alternatives to the Higher Criticism test of Donoho and Jin [Ann. Statist. 32 (2004) 962-994] for high-dimensional means under sparsity of the nonzero means, for sub-Gaussian data with unknown column-wise dependence. The two alternative test statistics are constructed by first thresholding $L_1$ and $L_2$ statistics based on the sample means, respectively, and then maximizing over a range of thresholding levels to make the tests adaptive to the unknown signal strength and sparsity. Both alternative tests attain the detection boundary of the Higher Criticism test, which was established in [Ann. Statist. 32 (2004) 962-994] for uncorrelated Gaussian data. We demonstrate that the maximal $L_2$-thresholding test is at least as powerful as the maximal $L_1$-thresholding test, and that both are at least as powerful as the Higher Criticism test.
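For concreteness, a minimal sketch of a maximal $L_2$-thresholding statistic of this kind. The threshold grid $\lambda(s)=\sqrt{2s\log p}$ and the standardization by Gaussian truncated moments are illustrative assumptions, and the null calibration below assumes independent coordinates, whereas the paper itself allows column-wise dependence:

```python
import numpy as np
from scipy.stats import norm

def maximal_l2_thresholding(z, s_grid=None):
    """Illustrative maximal L2-thresholding statistic.

    z : standardized coordinate-wise statistics (e.g. sqrt(n) times
        sample means over estimated standard deviations).
    Standardization uses N(0,1) truncated moments and assumes
    independent coordinates -- a simplification of the paper.
    """
    p = z.size
    if s_grid is None:
        s_grid = np.linspace(0.05, 0.95, 19)   # grid of sparsity levels s
    best = -np.inf
    for s in s_grid:
        lam = np.sqrt(2.0 * s * np.log(p))     # threshold lambda(s)
        t = np.sum(z[np.abs(z) > lam] ** 2)    # thresholded L2 sum
        phi, tail = norm.pdf(lam), norm.cdf(-lam)
        m1 = 2.0 * (lam * phi + tail)                          # E[Z^2 1{|Z|>lam}]
        m2 = 2.0 * ((lam**3 + 3.0 * lam) * phi + 3.0 * tail)   # E[Z^4 1{|Z|>lam}]
        best = max(best, (t - p * m1) / np.sqrt(p * (m2 - m1**2)))
    return best
```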
We consider the problem of detecting a sparse mixture, as studied by Ingster (1997) and Donoho and Jin (2004), for a wide array of base distributions. In particular, we study the situation in which the base distribution has polynomial tails, a setting that has received little attention in the literature. Perhaps surprisingly, we find that for such power-law base distributions the Higher Criticism does not achieve the detection boundary, whereas the scan statistic does.
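The Higher Criticism statistic referenced here is standard; a minimal sketch follows (normalization conventions vary slightly across papers, and the scan statistic that does attain the boundary is problem-specific and not shown):

```python
import numpy as np

def higher_criticism(pvals, alpha0=0.5):
    """Donoho-Jin Higher Criticism over the smallest alpha0-fraction
    of sorted p-values; exact normalizations vary in the literature."""
    p = np.sort(np.asarray(pvals))
    n = p.size
    i = np.arange(1, n + 1)
    p = np.clip(p, 1.0 / n, 1.0 - 1.0 / n)   # avoid division by zero
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1.0 - p))
    return hc[: max(1, int(alpha0 * n))].max()
```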
This paper proves the asymptotic normality of a statistic for detecting heteroscedasticity in linear regression models, without assuming randomness of the covariates, as the sample size $n$ tends to infinity and the number of covariates $p$ is either fixed or also tends to infinity. Moreover, our approach shows that the asymptotic normality holds even without homoscedasticity.
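The abstract does not specify the statistic itself; for orientation only, here is the classical fixed-$p$ Breusch-Pagan score test for heteroscedasticity, plainly not the paper's statistic:

```python
import numpy as np
from scipy import stats

def breusch_pagan(X, y):
    """Classical Breusch-Pagan LM test (illustration only; not the
    statistic analyzed in the paper). X is (n, p) without intercept."""
    n = X.shape[0]
    Xc = np.column_stack([np.ones(n), X])               # add intercept
    resid = y - Xc @ np.linalg.lstsq(Xc, y, rcond=None)[0]
    e2 = resid ** 2                                     # squared residuals
    fit = Xc @ np.linalg.lstsq(Xc, e2, rcond=None)[0]   # auxiliary regression
    r2 = 1.0 - np.sum((e2 - fit) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    lm = n * r2                                         # LM = n * R^2
    return lm, stats.chi2.sf(lm, df=X.shape[1])         # chi^2_p null
```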
Let $(Y,(X_i)_{i\in\mathcal{I}})$ be a zero-mean Gaussian vector and let $V$ be a subset of $\mathcal{I}$. Suppose we are given $n$ i.i.d. replications of the vector $(Y,X)$. We propose a new test of the null hypothesis that $Y$ is independent of $(X_i)_{i\in \mathcal{I}\setminus V}$ conditionally on $(X_i)_{i\in V}$, against the general alternative that it is not. The procedure does not depend on any prior information about the covariance of $X$ or the variance of $Y$, and it applies in a high-dimensional setting. It extends straightforwardly to testing the neighbourhood of a Gaussian graphical model. The procedure is based on a Gaussian regression model with random Gaussian covariates. We give nonasymptotic properties of the test and prove that it is rate optimal (up to a possible $\log(n)$ factor) over various classes of alternatives under some additional assumptions. In addition, it allows us to derive nonasymptotic minimax rates of testing in this setting. Finally, we carry out a simulation study to evaluate the performance of our procedure.
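In the classical regime $n > |\mathcal{I}|$, this null hypothesis is equivalent to the vanishing of the regression coefficients on $(X_i)_{i\in\mathcal{I}\setminus V}$, which a partial F-test detects; a minimal sketch of that low-dimensional analogue (the paper's construction, which handles the high-dimensional case, is more involved):

```python
import numpy as np
from scipy import stats

def partial_f_test(y, X_V, X_rest):
    """Low-dimensional analogue: Y is independent of X_rest given X_V
    iff the X_rest coefficients vanish in the joint regression."""
    n = y.size
    X0 = np.column_stack([np.ones(n), X_V])
    X1 = np.column_stack([X0, X_rest])
    rss0 = np.sum((y - X0 @ np.linalg.lstsq(X0, y, rcond=None)[0]) ** 2)
    rss1 = np.sum((y - X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]) ** 2)
    df1, df2 = X_rest.shape[1], n - X1.shape[1]
    F = (rss0 - rss1) / df1 / (rss1 / df2)
    return F, stats.f.sf(F, df1, df2)
```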
In this paper, we estimate the high-dimensional precision matrix under a weak sparsity condition, in which many entries are nearly zero. We study a Lasso-type method for high-dimensional precision matrix estimation and derive general error bounds under the weak sparsity condition. The common irrepresentable condition is relaxed, and the results apply to weakly sparse matrices. As applications, we study precision matrix estimation with the Lasso-type method for heavy-tailed data, nonparanormal data, and matrix data.
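As a generic illustration of Lasso-type precision matrix estimation (scikit-learn's graphical lasso on simulated tridiagonal-precision data, not the paper's estimator or its weak-sparsity analysis):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Toy data: tridiagonal precision matrix, hence sparse partial correlations.
rng = np.random.default_rng(0)
n, p = 200, 20
Omega = np.eye(p) + 0.3 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Omega), size=n)

# L1-penalized maximum-likelihood estimate of the precision matrix.
Omega_hat = GraphicalLasso(alpha=0.1).fit(X).precision_
```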
This paper considers Bayesian multiple testing under sparsity for polynomial-tailed distributions satisfying a monotone likelihood ratio property. This class includes the Student's t, the Pareto, and many other distributions. We prove some general asymptotic optimality results under fixed and random thresholding. As examples of these general results, we establish the Bayesian asymptotic optimality of several multiple testing procedures from the literature for appropriately chosen false discovery rate levels. We also show by simulation that the Benjamini-Hochberg procedure with a false discovery rate level different from the asymptotically optimal one can lead to a high Bayes risk.
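For reference, a minimal sketch of the Benjamini-Hochberg step-up procedure at level $q$, whose sensitivity to the choice of $q$ is the point of the simulation:

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure at FDR level q; returns a
    boolean rejection mask over the input p-values."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()   # largest i with p_(i) <= q*i/m
        reject[order[: k + 1]] = True
    return reject
```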