ﻻ يوجد ملخص باللغة العربية
We propose a new method, semi-penalized inference with direct false discovery rate control (SPIDR), for variable selection and confidence interval construction in high-dimensional linear regression. SPIDR first uses a semi-penalized approach to constructing estimators of the regression coefficients. We show that the SPIDR estimator is ideal in the sense that it equals an ideal least squares estimator with high probability under a sparsity and other suitable conditions. Consequently, the SPIDR estimator is asymptotically normal. Based on this distributional result, SPIDR determines the selection rule by directly controlling false discovery rate. This provides an explicit assessment of the selection error. This also naturally leads to confidence intervals for the selected coefficients with a proper confidence statement. We conduct simulation studies to evaluate its finite sample performance and demonstrate its application on a breast cancer gene expression data set. Our simulation studies and data example suggest that SPIDR is a useful method for high-dimensional statistical inference in practice.
Variable selection on the large-scale networks has been extensively studied in the literature. While most of the existing methods are limited to the local functionals especially the graph edges, this paper focuses on selecting the discrete hub struct
We develop a new class of distribution--free multiple testing rules for false discovery rate (FDR) control under general dependence. A key element in our proposal is a symmetrized data aggregation (SDA) approach to incorporating the dependence struct
The highly influential two-group model in testing a large number of statistical hypotheses assumes that the test statistics are drawn independently from a mixture of a high probability null distribution and a low probability alternative. Optimal cont
Selecting relevant features associated with a given response variable is an important issue in many scientific fields. Quantifying quality and uncertainty of a selection result via false discovery rate (FDR) control has been of recent interest. This
Multiple hypothesis testing, a situation when we wish to consider many hypotheses, is a core problem in statistical inference that arises in almost every scientific field. In this setting, controlling the false discovery rate (FDR), which is the expe