
Detection of Sparse Mixtures: Higher Criticism and Scan Statistic

Published by: Ery Arias-Castro
Publication date: 2018
Research field: Mathematical Statistics
Paper language: English





We consider the problem of detecting a sparse mixture as studied by Ingster (1997) and Donoho and Jin (2004). We consider a wide array of base distributions. In particular, we study the situation when the base distribution has polynomial tails, a situation that has not received much attention in the literature. Perhaps surprisingly, we find that in the context of such a power-law distribution, the higher criticism does not achieve the detection boundary. However, the scan statistic does.
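As a point of reference for the higher criticism mentioned in the abstract, here is a minimal sketch of the statistic in the form usually attributed to Donoho and Jin (2004), computed from p-values. The function name `higher_criticism`, the `alpha0` cutoff, and the choice to skip p-values equal to 0 or 1 are assumptions made for illustration. The variant analyzed in the paper may differ.

```python
import math


def higher_criticism(pvalues, alpha0=0.5):
    """Illustrative higher criticism statistic (assumed form, not the paper's code).

    HC = max over 1 <= i <= alpha0 * n of
         sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))),
    where p_(1) <= ... <= p_(n) are the sorted p-values.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    sorted_p = sorted(pvalues)
    # Only the smallest alpha0 * n p-values are scanned (at least one).
    k = max(1, int(alpha0 * n))
    best = -math.inf
    for i, p in enumerate(sorted_p[:k], start=1):
        # Assumption: skip degenerate p-values to avoid division by zero.
        if p <= 0.0 or p >= 1.0:
            continue
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


if __name__ == "__main__":
    n = 100
    # Evenly spread p-values stand in for a "null-like" sample.
    null_like = [(i - 0.5) / n for i in range(1, n + 1)]
    # Replace the five smallest with very small p-values (a sparse signal).
    sparse = [1e-6] * 5 + null_like[5:]
    print(higher_criticism(null_like), higher_criticism(sparse))
```

On the toy inputs above, the statistic is much larger for the sample with a few very small p-values, which is the behavior the test relies on. If every scanned p-value is degenerate, this sketch returns negative infinity.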




Read also

We obtain an asymptotic expansion for the null distribution function of the gradient statistic for testing composite null hypotheses in the presence of nuisance parameters. The expansion is derived using a Bayesian route based on the shrinkage argument described in Ghosh and Mukerjee (1991). Using this expansion, we propose a Bartlett-type corrected gradient statistic with chi-square distribution up to an error of order $o(n^{-1})$ under the null hypothesis. Further, we also use the expansion to modify the percentage points of the large sample reference chi-square distribution. A small Monte Carlo experiment and various examples are presented and discussed.
We consider two alternative tests to the Higher Criticism test of Donoho and Jin [Ann. Statist. 32 (2004) 962-994] for high-dimensional means under the sparsity of the nonzero means for sub-Gaussian distributed data with unknown column-wise dependence. The two alternative test statistics are constructed by first thresholding $L_1$ and $L_2$ statistics based on the sample means, respectively, followed by maximizing over a range of thresholding levels to make the tests adaptive to the unknown signal strength and sparsity. The two alternative tests can attain the same detection boundary of the Higher Criticism test in [Ann. Statist. 32 (2004) 962-994] which was established for uncorrelated Gaussian data. It is demonstrated that the maximal $L_2$-thresholding test is at least as powerful as the maximal $L_1$-thresholding test, and both the maximal $L_2$ and $L_1$-thresholding tests are at least as powerful as the Higher Criticism test.
Andrew Ying, Wen-Xin Zhou, 2019
We investigate the asymptotic behavior of several variants of the scan statistic applied to empirical distributions, which can be applied to detect the presence of an anomalous interval with any length. Of particular interest is the Studentized scan statistic, which is preferable in practice. The main ingredients in the proof are Kolmogorov's theorem, a Poisson approximation, and recent technical results by Kabluchko et al. (2014).
Higher Criticism is a recently developed statistic for non-Gaussian detection, proposed in Donoho & Jin 2004. We find that Higher Criticism is useful for two purposes. First, Higher Criticism has competitive detection power, and non-Gaussianity is detected at the 99% level in the first year WMAP data. We find that the Higher Criticism value of WMAP is outside the 99% confidence region at a wavelet scale of 5 degrees (99.46% of Higher Criticism values based on simulated maps are below the values for WMAP). Second, Higher Criticism offers a way to locate a small portion of data that accounts for the non-Gaussianity. Using Higher Criticism, we have successfully identified a ring of pixels centered at ($l \approx 209^\circ$, $b \approx -57^\circ$), which seems to account for the observed detection of non-Gaussianity at the wavelet scale of 5 degrees. Note that the detection is achieved in wavelet space first. Second, it is always possible that a fraction of pixels within the ring might deviate from Gaussianity even if they do not appear to be above the 99% confidence level in wavelet space. The location of the ring coincides with the cold spot detected in Vielva et al. 2004 and Cruz et al. 2005.
In a multiple testing framework, we propose a method that identifies the interval with the highest estimated false discovery rate of P-values and rejects the corresponding null hypotheses. Unlike the Benjamini-Hochberg method, which does the same but over intervals with an endpoint at the origin, the new procedure "scans" all intervals. In parallel with Storey et al. (2004), we show that this scan procedure provides strong control of asymptotic false discovery rate. In addition, we investigate its asymptotic false non-discovery rate, deriving conditions under which it outperforms the Benjamini-Hochberg procedure. For example, the scan procedure is superior in power-law location models.
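For context on the baseline named in the abstract above, here is a minimal sketch of the standard Benjamini-Hochberg step-up procedure. It is not the scan procedure the abstract proposes. The function name `benjamini_hochberg` and the parameter `q` (the target false discovery rate) are illustrative choices.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return the sorted indices of hypotheses rejected by the BH step-up rule.

    With sorted p-values p_(1) <= ... <= p_(n), find the largest rank k such
    that p_(k) <= q * k / n and reject the k hypotheses with the smallest
    p-values. Illustrative sketch of the standard baseline only.
    """
    n = len(pvalues)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(n), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= q * rank / n:
            k = rank  # keep the largest rank that passes its threshold
    return sorted(order[:k])


if __name__ == "__main__":
    print(benjamini_hochberg([0.01, 0.9, 0.02, 0.8], q=0.1))
```

In the sense of the abstract, the rejected set here always corresponds to an interval of p-values with an endpoint at the origin, namely from 0 up to the k-th smallest p-value.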