Combining independent p-values in replicability analysis: A comparative study


Abstract in English

Given a family of null hypotheses $H_{1},\ldots,H_{s}$, we are interested in the hypothesis $H_{s}^{\gamma}$ that at most $\gamma-1$ of these null hypotheses are false. Assuming that the corresponding $p$-values are independent, we investigate combined $p$-values that are valid for testing $H_{s}^{\gamma}$. Across various settings in which $H_{s}^{\gamma}$ is false, we determine which combined $p$-value performs well in which setting. Via simulations, we find that the Stouffer method works well if the null $p$-values are uniformly distributed and the signal strength is low, whereas the Fisher method works better if the null $p$-values are conservative, i.e., stochastically larger than the uniform distribution. The minimum method works well if the evidence for rejecting $H_{s}^{\gamma}$ is concentrated in only a few non-null $p$-values, especially if the null $p$-values are conservative. Methods that incorporate the combination of $e$-values work well if the null hypotheses $H_{1},\ldots,H_{s}$ are simple.
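To make the idea of a combined $p$-value for $H_{s}^{\gamma}$ concrete, the following is a minimal sketch, not the implementation used in the study. It assumes the standard partial-conjunction construction: sort the $p$-values, discard the $\gamma-1$ smallest, and combine the remaining $s-\gamma+1$ largest. The function names (`fisher_pc`, `stouffer_pc`, `bonferroni_pc`) are hypothetical, the inputs are assumed to lie strictly between 0 and 1, and the Bonferroni-type rule is only one possible reading of a minimum-based method; it may differ from the minimum method compared in the paper.

```python
import math
from statistics import NormalDist


def _largest(pvalues, gamma):
    """Return the s - gamma + 1 largest p-values in ascending order."""
    s = len(pvalues)
    if not 1 <= gamma <= s:
        raise ValueError("gamma must satisfy 1 <= gamma <= s")
    return sorted(pvalues)[gamma - 1:]


def fisher_pc(pvalues, gamma):
    """Fisher combination applied to the s - gamma + 1 largest p-values."""
    tail = _largest(pvalues, gamma)
    k = len(tail)
    half = -sum(math.log(p) for p in tail)  # half of the statistic -2 * sum(log p)
    # Survival function of a chi-square with 2k degrees of freedom at 2 * half:
    # exp(-half) * sum_{i=0}^{k-1} half**i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return min(1.0, math.exp(-half) * total)


def stouffer_pc(pvalues, gamma):
    """Stouffer (inverse normal) combination of the s - gamma + 1 largest p-values."""
    tail = _largest(pvalues, gamma)
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in tail) / math.sqrt(len(tail))
    return 1.0 - nd.cdf(z)


def bonferroni_pc(pvalues, gamma):
    """Bonferroni-type rule: (s - gamma + 1) times the gamma-th smallest p-value."""
    tail = _largest(pvalues, gamma)
    return min(1.0, len(tail) * tail[0])


if __name__ == "__main__":
    # Illustrative, made-up p-values; gamma = 2 tests "at most one null is false".
    p = [0.01, 0.02, 0.03, 0.5]
    for combine in (fisher_pc, stouffer_pc, bonferroni_pc):
        print(combine.__name__, combine(p, gamma=2))
```

With $\gamma = s$ only the largest $p$-value is retained, so all three functions return it unchanged; with $\gamma = 1$ they reduce to the usual combination tests of the global null.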
