
A Scan Procedure for Multiple Testing

Added by Andrew Ying
Publication date: 2018
Research language: English





In a multiple testing framework, we propose a method that identifies the interval with the highest estimated false discovery rate of P-values and rejects the corresponding null hypotheses. Unlike the Benjamini-Hochberg method, which does the same but over intervals with an endpoint at the origin, the new procedure scans all intervals. In parallel with Storey, Taylor and Siegmund (2004), we show that this scan procedure provides strong control of the asymptotic false discovery rate. In addition, we investigate its asymptotic false non-discovery rate, deriving conditions under which it outperforms the Benjamini-Hochberg procedure. For example, the scan procedure is superior in power-law location models.
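To make the comparison concrete, here is a minimal Python sketch. It assumes the estimated false discovery rate of a candidate interval (a, b] is n(b - a) / #{i : P_i in (a, b]}, i.e., the expected count of uniform null P-values over the observed count; BH corresponds to restricting the scan to intervals with a = 0. The selection rule used here, most rejections among intervals with estimated FDR at most alpha, is an illustrative reading of the abstract, not the paper's exact procedure.

import numpy as np

def scan_procedure(pvals, alpha=0.1):
    """Scan all intervals (p[i-1], p[j]] over sorted P-values; among those
    with estimated FDR <= alpha, return the one rejecting the most hypotheses.
    Illustrative sketch; endpoints are taken at observed P-values."""
    p = np.sort(np.asarray(pvals))
    n = len(p)
    best_count, best_interval = 0, None
    for i in range(n):
        lo = p[i - 1] if i > 0 else 0.0
        for j in range(i, n):
            count = j - i + 1                  # P-values falling in (lo, p[j]]
            fdr_hat = n * (p[j] - lo) / count  # nulls are Uniform(0, 1)
            if fdr_hat <= alpha and count > best_count:
                best_count, best_interval = count, (lo, p[j])
    return best_interval  # None if no interval has estimated FDR <= alpha

def benjamini_hochberg(pvals, alpha=0.1):
    """BH: the widest interval (0, p[j]] with estimated FDR n * p[j] / j <= alpha."""
    p = np.sort(np.asarray(pvals))
    n = len(p)
    ok = np.nonzero(n * p / np.arange(1, n + 1) <= alpha)[0]
    return (0.0, p[ok[-1]]) if len(ok) else None

The quadratic loop over all endpoint pairs is the naive implementation, used here only to make the contrast with BH's one-sided scan explicit.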



Related research


We study an online multiple testing problem where the hypotheses arrive sequentially in a stream. The test statistics are independent and assumed to have the same distribution under their respective null hypotheses. We investigate two procedures, LORD and LOND, proposed by Javanmard and Montanari (2015), which are proved to control the FDR in an online manner. In a (static) model, we show that LORD is optimal in some asymptotic sense, in particular as powerful as the (static) Benjamini-Hochberg procedure to first asymptotic order. We also quantify the performance of LOND. Some numerical experiments complement our theory.
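For reference, here is a short sketch of LOND, the simpler of the two rules: the i-th hypothesis is tested at level alpha_i = beta_i * (D(i-1) + 1), where D(i-1) counts discoveries among the first i-1 tests and the spending sequence beta sums to alpha. The particular beta below is an illustrative choice; LORD, which redistributes alpha after each discovery, is omitted.

import numpy as np

def lond(pvals, alpha=0.1):
    """LOND: test the i-th hypothesis at level alpha_i = beta_i * (D(i-1) + 1),
    with sum_i beta_i = alpha. Finite-stream sketch; a genuinely online
    version fixes the sequence beta in advance."""
    n = len(pvals)
    beta = 1.0 / np.arange(1, n + 1) ** 2   # illustrative spending sequence
    beta *= alpha / beta.sum()
    rejected, d = [], 0
    for i, p in enumerate(pvals):
        if p <= beta[i] * (d + 1):
            rejected.append(i)
            d += 1
    return rejected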
We study a stylized multiple testing problem where the test statistics are independent and assumed to have the same distribution under their respective null hypotheses. We first show that, in the normal means model where the test statistics are normal Z-scores, the well-known method of Benjamini and Hochberg (1995) is optimal in some asymptotic sense. We then show that this is also the case for a recent distribution-free method proposed by Foygel-Barber and Candès (2015). The method is distribution-free in the sense that it is agnostic to the null distribution; it only requires that the null distribution be symmetric. We extend these optimality results to other location models with a base distribution having fast-decaying tails.
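The symmetry idea admits a compact sketch: when the statistics are symmetric about zero under the null, the count of statistics below -t conservatively estimates the count of null statistics above t. The code below follows the commonly stated Barber-Candès selection rule; treat it as a sketch rather than the paper's exact formulation.

import numpy as np

def barber_candes(z, alpha=0.1):
    """Reject {i : z_i >= t} at the smallest t such that
    (1 + #{i : z_i <= -t}) / max(#{i : z_i >= t}, 1) <= alpha.
    Valid when the z_i are symmetric about zero under the null."""
    z = np.asarray(z)
    for t in np.sort(np.abs(z)):             # candidate thresholds
        fdp_hat = (1 + np.sum(z <= -t)) / max(np.sum(z >= t), 1)
        if fdp_hat <= alpha:
            return np.nonzero(z >= t)[0]
    return np.array([], dtype=int)            # nothing selected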
Xueying Tang, Ke Li, Malay Ghosh (2015)
This paper considers Bayesian multiple testing under sparsity for polynomial-tailed distributions satisfying a monotone likelihood ratio property. Included in this class of distributions are the Student's t, the Pareto, and many other distributions. We prove some general asymptotic optimality results under fixed and random thresholding. As examples of these general results, we establish the Bayesian asymptotic optimality of several multiple testing procedures in the literature for appropriately chosen false discovery rate levels. We also show by simulation that the Benjamini-Hochberg procedure with a false discovery rate level different from the asymptotically optimal one can lead to high Bayes risk.
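The simulation point can be reproduced in a few lines. Below is a hedged sketch assuming a two-groups model with Student's t signals, with the Bayes risk taken as the number of misclassified tests (false discoveries plus missed signals); the specific FDR levels and parameters are illustrative, not those in the paper.

import numpy as np
from scipy import stats

def bh_mask(pvals, alpha):
    """Boolean mask of BH rejections at FDR level alpha."""
    p = np.sort(pvals)
    n = len(p)
    ok = np.nonzero(n * p / np.arange(1, n + 1) <= alpha)[0]
    return pvals <= p[ok[-1]] if len(ok) else np.zeros(n, dtype=bool)

rng = np.random.default_rng(0)
n, eps = 10_000, 0.01                       # sample size and sparsity
signal = rng.random(n) < eps
x = np.where(signal, rng.standard_t(df=3, size=n) + 5, rng.standard_normal(n))
pvals = stats.norm.sf(x)                    # one-sided P-values, N(0, 1) nulls
for alpha in (0.01, 0.1, 0.5):              # illustrative FDR levels
    rej = bh_mask(pvals, alpha)
    risk = np.sum(rej & ~signal) + np.sum(~rej & signal)
    print(f"alpha={alpha}: misclassified tests = {risk}")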
Classification rules can be severely affected by the presence of disturbing observations in the training sample. Looking for an optimal classifier with such data may lead to unnecessarily complex rules. Simpler, effective classification rules can therefore be achieved if we relax the goal of fitting a good rule to the whole training sample and instead consider only a fraction of the data. In this paper we introduce a new method based on trimming to produce classification rules with guaranteed performance on a significant fraction of the data. In particular, we provide an automatic way of determining the right trimming proportion and obtain, in this setting, oracle bounds for the classification error on new data.
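As an illustration of the trimming idea only (not the paper's specific algorithm, which also selects the trimming proportion automatically), here is a generic heuristic: fit a classifier, discard the fraction of training points with the largest loss, and refit on the retained portion.

import numpy as np
from sklearn.linear_model import LogisticRegression

def trimmed_fit(X, y, trim=0.1, rounds=5):
    """Refit a classifier on the (1 - trim) fraction of points it explains
    best, iterating a few times; y must be 0/1 labels."""
    keep = np.ones(len(y), dtype=bool)
    clf = LogisticRegression()
    for _ in range(rounds):
        clf.fit(X[keep], y[keep])
        # Per-point negative log-likelihood of the true label.
        proba = clf.predict_proba(X)[np.arange(len(y)), y]
        loss = -np.log(proba + 1e-12)
        keep = loss <= np.quantile(loss, 1 - trim)
    return clf, keep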
Ray Bai, Malay Ghosh (2017)
We study the well-known problem of estimating a sparse $n$-dimensional unknown mean vector $\theta = (\theta_1, \ldots, \theta_n)$ with entries corrupted by Gaussian white noise. In the Bayesian framework, continuous shrinkage priors which can be expressed as scale-mixture normal densities are popular for obtaining sparse estimates of $\theta$. In this article, we introduce a new fully Bayesian scale-mixture prior known as the inverse gamma-gamma (IGG) prior. We prove that the posterior distribution contracts around the true $\theta$ at the (near) minimax rate under very mild conditions. In the process, we prove that the sufficient conditions for minimax posterior contraction given by Van der Pas et al. (2016) are not necessary for optimal posterior contraction. We further show that the IGG posterior density concentrates at a rate faster than those of the horseshoe or the horseshoe+ in the Kullback-Leibler (K-L) sense. To classify true signals ($\theta_i \neq 0$), we also propose a hypothesis test based on thresholding the posterior mean. Taking the loss function to be the expected number of misclassified tests, we show that our test procedure asymptotically attains the optimal Bayes risk exactly. We illustrate through simulations and data analysis that the IGG has excellent finite sample performance for both estimation and classification.
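The posterior-mean thresholding test can be illustrated without the IGG machinery. The sketch below substitutes a conjugate spike-and-slab posterior mean as a stand-in shrinkage estimate (the IGG posterior mean has no simple closed form here) and flags a signal when the posterior mean is large in absolute value; the prior weights and threshold are hypothetical.

import numpy as np
from scipy import stats

def posterior_mean_test(x, w=0.05, tau2=25.0, thresh=1.0):
    """Normal means model x_i ~ N(theta_i, 1). Stand-in spike-and-slab prior:
    theta_i = 0 w.p. 1 - w, theta_i ~ N(0, tau2) w.p. w. Flag theta_i != 0
    when the posterior mean exceeds `thresh` in absolute value."""
    slab = w * stats.norm.pdf(x, scale=np.sqrt(1.0 + tau2))  # marginal under slab
    spike = (1.0 - w) * stats.norm.pdf(x)                    # marginal under spike
    incl = slab / (slab + spike)             # posterior P(theta_i != 0 | x_i)
    post_mean = incl * (tau2 / (1.0 + tau2)) * x
    return np.abs(post_mean) > thresh, post_mean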
