Simultaneous Model Selection and Estimation for Mean and Association Structures with Clustered Binary Data


Abstract in English

This paper investigates the properties of penalized estimating equations when both the mean and association structures are modelled. To select variables for the mean and association structures sequentially, we propose a hierarchical penalized generalized estimating equations (HPGEE2) approach. The first set of penalized estimating equations is solved to select the significant mean parameters. Conditional on the selected mean model, the second set of penalized estimating equations is solved to select the significant association parameters. The hierarchical approach is designed to accommodate possible model constraints relating the inclusion of covariates in the mean model to their inclusion in the association model. This two-step penalization strategy has the advantage of reducing the computational burden relative to solving the two sets of penalized equations simultaneously. HPGEE2 with a smoothly clipped absolute deviation (SCAD) penalty is shown to have the oracle property for both the mean and association models. The asymptotic behavior of the penalized estimator under this hierarchical approach is established. An efficient two-stage penalized weighted least squares algorithm is developed to implement the proposed method. The empirical performance of the proposed HPGEE2 is demonstrated through Monte Carlo studies and the analysis of a clinical data set.
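
For readers unfamiliar with the SCAD penalty named above, the following is a minimal sketch of its commonly cited form and first derivative. The abstract does not give the formula, so the piecewise definition, the conventional default a = 3.7, and the function names `scad_penalty` and `scad_derivative` are assumptions for illustration only, not the paper's implementation.

```python
def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty at a single coefficient (standard form; a > 2 assumed)."""
    theta = abs(beta)
    if theta <= lam:
        # Small coefficients: same linear penalty as the lasso.
        return lam * theta
    if theta <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * theta - theta ** 2 - lam ** 2) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam ** 2 * (a + 1) / 2


def scad_derivative(beta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |beta|."""
    theta = abs(beta)
    if theta <= lam:
        return lam
    # Decays linearly to zero on (lam, a * lam] and stays at zero afterwards.
    return max(a * lam - theta, 0.0) / (a - 1)
```

The derivative vanishes once |beta| exceeds a * lam, which is the feature usually credited with leaving large coefficients nearly unbiased; how this penalty enters the two sets of estimating equations is specified in the paper itself and is not reproduced here.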
