Post-selection estimation and testing following aggregated association tests


Abstract in English

The practice of pooling several individual test statistics to form aggregate tests is common in many statistical application where individual tests may be underpowered. While selection by aggregate tests can serve to increase power, the selection process invalidates the individual test-statistics, making it difficult to identify the ones that drive the signal in follow-up inference. Here, we develop a general approach for valid inference following selection by aggregate testing. We present novel powerful post-selection tests for the individual null hypotheses which are exact for the normal model and asymptotically justified otherwise. Our approach relies on the ability to characterize the distribution of the individual test statistics after conditioning on the event of selection. We provide efficient algorithms for estimation of the post-selection maximum-likelihood estimates and suggest confidence intervals which rely on a novel switching regime for good coverage guarantees. We validate our methods via comprehensive simulation studies and apply them to data from the Dallas Heart Study, demonstrating that single variant association discovery following selection by an aggregated test is indeed possible in practice.

Download