Distribution-free Multiple Testing

Added by Shiyun Chen

Publication date 2016

fields Mathematical Statistics

and research's language is English

Authors Ery Arias-Castro - Shiyun Chen

Statistics Theory Statistics Theory

We study a stylized multiple testing problem where the test statistics are independent and assumed to have the same distribution under their respective null hypotheses. We first show that, in the normal means model where the test statistics are normal Z-scores, the well-known method of Benjamini and Hochberg (1995) is optimal in some asymptotic sense. We then show that this is also the case for a recent distribution-free method proposed by Foygel-Barber and Candès (2015). The method is distribution-free in the sense that it is agnostic to the null distribution: it only requires that the null distribution be symmetric. We extend these optimality results to other location models with a base distribution having fast-decaying tails.
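The two procedures named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `benjamini_hochberg` and `symmetric_null_rejections`, the default level `q`, and the one-sided convention (large positive statistics count as evidence against the null) are assumptions made here. The second function follows the usual description of the Barber-Candès idea, where the count of statistics below `-t` serves as a stand-in for the number of false positives above `t`, which is valid only if the null distribution is symmetric about zero.

```python
import numpy as np


def benjamini_hochberg(pvalues, q=0.1):
    """Boolean rejection mask for the BH step-up procedure at level q."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the k-th smallest p-value with k * q / m.
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank meeting the bound
        reject[order[: k + 1]] = True
    return reject


def symmetric_null_rejections(stats, q=0.1):
    """Distribution-free thresholding that only assumes a symmetric null.

    At each candidate threshold t, the false discovery proportion is
    estimated by (1 + #{x_i <= -t}) / max(1, #{x_i >= t}); the smallest
    t whose estimate is at most q is used as the rejection threshold.
    """
    x = np.asarray(stats, dtype=float)
    reject = np.zeros(x.size, dtype=bool)
    for t in np.sort(np.abs(x[x != 0])):
        n_pos = np.sum(x >= t)
        n_neg = np.sum(x <= -t)
        if (1 + n_neg) / max(n_pos, 1) <= q:
            reject = x >= t
            break
    return reject
```

Note that the second function never evaluates a p-value, so it needs no knowledge of the null distribution beyond symmetry, whereas the first requires valid p-values as input.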

Sequential Multiple Testing

89 - Shiyun Chen , Ery Arias-Castro 2017

We study an online multiple testing problem where the hypotheses arrive sequentially in a stream. The test statistics are independent and assumed to have the same distribution under their respective null hypotheses. We investigate two procedures, LORD and LOND, proposed by Javanmard and Montanari (2015), which are proved to control the FDR in an online manner. In some (static) model, we show that LORD is optimal in some asymptotic sense, in particular as powerful as the (static) Benjamini-Hochberg procedure to first asymptotic order. We also quantify the performance of LOND. Some numerical experiments complement our theory.
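To make the online setting concrete, here is a minimal sketch of LOND as it is commonly described: hypothesis i is tested at level beta_i * (D + 1), where D is the number of discoveries made so far and the beta_i sum to the target level alpha. The function name `lond` and the particular choice beta_i = alpha * 6 / (pi^2 * i^2) are assumptions made for this illustration; the abstract does not specify a sequence.

```python
import math


def lond(pvalues, alpha=0.1):
    """Online decisions for a stream of p-values (LOND-style rule).

    Each decision uses only the current p-value and the number of
    discoveries made so far, so it can be applied as hypotheses arrive.
    """
    discoveries = 0
    decisions = []
    for i, p in enumerate(pvalues, start=1):
        # Assumed sequence: beta_i = alpha * 6 / (pi^2 * i^2), summing to alpha.
        beta_i = alpha * 6.0 / (math.pi ** 2 * i ** 2)
        level_i = beta_i * (discoveries + 1)
        reject = p <= level_i
        decisions.append(reject)
        discoveries += int(reject)
    return decisions
```

Because the test level grows with the number of past discoveries, an early rejection makes later rejections easier, which is visible when comparing streams that differ only in their first p-value.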

Statistics Theory Statistics Theory

A Scan Procedure for Multiple Testing

77 - Shiyun Chen , Andrew Ying , Ery Arias-Castro 2018

In a multiple testing framework, we propose a method that identifies the interval with the highest estimated false discovery rate of P-values and rejects the corresponding null hypotheses. Unlike the Benjamini-Hochberg method, which does the same but over intervals with an endpoint at the origin, the new procedure 'scans' all intervals. In parallel with Storey, Taylor and Siegmund (2004), we show that this scan procedure provides strong control of asymptotic false discovery rate. In addition, we investigate its asymptotic false non-discovery rate, deriving conditions under which it outperforms the Benjamini-Hochberg procedure. For example, the scan procedure is superior in power-law location models.

Statistics Theory Statistics Theory

Bayesian Multiple Testing Under Sparsity for Polynomial-Tailed Distributions

173 - Xueying Tang , Ke Li , Malay Ghosh 2015

This paper considers Bayesian multiple testing under sparsity for polynomial-tailed distributions satisfying a monotone likelihood ratio property. Included in this class of distributions are the Student's t, the Pareto, and many other distributions. We prove some general asymptotic optimality results under fixed and random thresholding. As examples of these general results, we establish the Bayesian asymptotic optimality of several multiple testing procedures in the literature for appropriately chosen false discovery rate levels. We also show by simulation that the Benjamini-Hochberg procedure with a false discovery rate level different from the asymptotically optimal one can lead to high Bayes risk.

Statistics Theory Statistics Theory

The Inverse Gamma-Gamma Prior for Optimal Posterior Contraction and Multiple Hypothesis Testing

104 - Ray Bai , Malay Ghosh 2017

We study the well-known problem of estimating a sparse $n$-dimensional unknown mean vector $\theta = (\theta_1, \ldots, \theta_n)$ with entries corrupted by Gaussian white noise. In the Bayesian framework, continuous shrinkage priors which can be expressed as scale-mixture normal densities are popular for obtaining sparse estimates of $\theta$. In this article, we introduce a new fully Bayesian scale-mixture prior known as the inverse gamma-gamma (IGG) prior. We prove that the posterior distribution contracts around the true $\theta$ at (near) minimax rate under very mild conditions. In the process, we prove that the sufficient conditions for minimax posterior contraction given by Van der Pas et al. (2016) are not necessary for optimal posterior contraction. We further show that the IGG posterior density concentrates at a rate faster than those of the horseshoe or the horseshoe+ in the Kullback-Leibler (K-L) sense. To classify true signals ($\theta_i \neq 0$), we also propose a hypothesis test based on thresholding the posterior mean. Taking the loss function to be the expected number of misclassified tests, we show that our test procedure asymptotically attains the optimal Bayes risk exactly. We illustrate through simulations and data analysis that the IGG has excellent finite sample performance for both estimation and classification.

Statistics Theory Statistics Theory

Distribution-free consistent independence tests via center-outward ranks and signs

179 - Hongjian Shi , Mathias Drton , 2019

This paper investigates the problem of testing independence of two random vectors of general dimensions. For this, we give for the first time a distribution-free consistent test. Our approach combines distance covariance with the center-outward ranks and signs developed in Hallin (2017). In technical terms, the proposed test is consistent and distribution-free in the family of multivariate distributions with nonvanishing (Lebesgue) probability densities. Exploiting the (degenerate) U-statistic structure of the distance covariance and the combinatorial nature of Hallin's center-outward ranks and signs, we are able to derive the limiting null distribution of our test statistic. The resulting asymptotic approximation is accurate already for moderate sample sizes and makes the test implementable without requiring permutation. The limiting distribution is derived via a more general result that gives a new type of combinatorial non-central limit theorem for double- and multiple-indexed permutation statistics.
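As a rough illustration of one ingredient mentioned above, the sample distance covariance can be computed from double-centered pairwise distance matrices. This sketch covers only the plain (V-statistic) distance covariance applied to raw observations; it does not compute center-outward ranks and signs, and it is not the test proposed in the paper. The function name `distance_covariance_sq` is an assumption made here.

```python
import numpy as np


def distance_covariance_sq(x, y):
    """Squared sample distance covariance (V-statistic form).

    x and y hold n paired observations each; rows are observations and
    columns are coordinates (1-D inputs are treated as n scalars).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x.reshape(x.shape[0], -1)
    y = y.reshape(y.shape[0], -1)

    def centered_distances(z):
        # Pairwise Euclidean distances, then double centering.
        d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
        return (d - d.mean(axis=0, keepdims=True)
                - d.mean(axis=1, keepdims=True) + d.mean())

    a = centered_distances(x)
    b = centered_distances(y)
    return float((a * b).mean())
```

The statistic is zero when either sample is constant and positive when the two samples move together, which is what makes it a natural building block for an independence test.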

Statistics Theory Statistics Theory