Large-scale genomic studies often require accurate estimates of small $p$-values, both to adjust for multiple hypothesis tests and to rank genomic features by statistical significance. For complicated test statistics whose cumulative distribution functions are analytically intractable, existing methods usually perform poorly for small $p$-values because of limited accuracy or computational restrictions. We propose a general approach for accurately and efficiently calculating small $p$-values for a broad range of complicated test statistics, based on the principle of the cross-entropy method and Markov chain Monte Carlo sampling techniques. We evaluate the performance of the proposed algorithm through simulations and demonstrate its application to three real examples in genomic studies. The results show that our approach can accurately evaluate small to extremely small $p$-values (e.g. $10^{-6}$ to $10^{-100}$). The proposed algorithm can help improve existing test procedures and support the development of new ones in genomic studies.
Permutation tests are commonly used for estimating p-values from statistical hypothesis testing when the sampling distribution of the test statistic under the null hypothesis is not available or unreliable for finite sample sizes. One critical challe
Given a family of null hypotheses $H_{1},\ldots,H_{s}$, we are interested in the hypothesis $H_{s}^{\gamma}$ that at most $\gamma-1$ of these null hypotheses are false. Assuming that the corresponding $p$-values are independent, we are investigating com
Accurate real-time tracking of influenza outbreaks helps public health officials make timely and meaningful decisions that could save lives. We propose an influenza tracking model, ARGO (AutoRegression with GOogle search data), that uses publicly ava
We propose a novel method for computing $p$-values based on nested sampling (NS) applied to the sampling space rather than the parameter space of the problem, in contrast to its usage in Bayesian computation. The computational cost of NS scales as $l
Using the growing volumes of vehicle trajectory data, it becomes increasingly possible to capture time-varying and uncertain travel costs in a road network, including travel time and fuel consumption. The current paradigm represents a road network as