Penalized integrative analysis under the accelerated failure time model


Abstract in English

For survival data with high-dimensional covariates, the analysis of a single dataset often yields unsatisfactory results because of the small sample size. Integrative analysis pools raw data from multiple independent studies with comparable designs, effectively increases the sample size, and performs better than both meta-analysis and single-dataset analysis. In this study, we conduct integrative analysis of survival data under the accelerated failure time (AFT) model. The sparsity structures of multiple datasets are described using the homogeneity and heterogeneity models. For variable selection under the homogeneity model, we adopt group penalization approaches. For variable selection under the heterogeneity model, we use composite penalization and sparse group penalization approaches. As a major advance over existing studies, the asymptotic selection and estimation properties are rigorously established. A simulation study is conducted to compare the different penalization methods with one another and against alternatives. We also analyze four lung cancer prognosis datasets with gene expression measurements.
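To make the idea of group penalization under the homogeneity model concrete, the following is a minimal sketch and not the estimator studied in the paper. It rests on several assumptions that the abstract does not state: a least-squares loss on uncensored log event times (censoring is ignored entirely), a group lasso penalty as one possible group penalty, and a plain proximal gradient solver. The function names `group_soft_threshold` and `integrative_group_lasso` and all simulated values are hypothetical. Each covariate's coefficients across the datasets form one group, so a covariate is either selected in all datasets or in none, while its effect sizes may differ between datasets.

```python
import numpy as np


def group_soft_threshold(B, t):
    """Shrink each row of B (one covariate across all datasets) by t in L2 norm."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * B


def integrative_group_lasso(Xs, ys, lam, n_iter=300):
    """Proximal gradient for
    sum_m ||y_m - X_m b_m||^2 / (2 n_m) + lam * sum_j ||B[j, :]||_2,
    where column m of B holds the coefficients of dataset m.
    Assumption: ys are uncensored log event times."""
    p, M = Xs[0].shape[1], len(Xs)
    B = np.zeros((p, M))
    # The loss is separable over datasets, so the Lipschitz constant of its
    # gradient is the largest per-dataset value of sigma_max(X)^2 / n.
    lipschitz = max(np.linalg.norm(X, 2) ** 2 / X.shape[0] for X in Xs)
    step = 1.0 / lipschitz
    for _ in range(n_iter):
        grad = np.column_stack([
            X.T @ (X @ B[:, m] - y) / X.shape[0]
            for m, (X, y) in enumerate(zip(Xs, ys))
        ])
        B = group_soft_threshold(B - step * grad, step * lam)
    return B


# Simulated example (all values hypothetical): three datasets, 20 covariates,
# the first three covariates matter in every dataset with different effects.
rng = np.random.default_rng(0)
n, p, M = 100, 20, 3
beta_true = np.zeros((p, M))
beta_true[:3, :] = [[1.0, 0.8, 1.2], [-1.0, -1.2, -0.8], [0.9, 1.1, 1.0]]
Xs = [rng.standard_normal((n, p)) for _ in range(M)]
ys = [X @ beta_true[:, m] + 0.5 * rng.standard_normal(n) for m, X in enumerate(Xs)]

B_hat = integrative_group_lasso(Xs, ys, lam=0.2)
selected = np.flatnonzero(np.linalg.norm(B_hat, axis=1) > 0)
print("selected covariates:", selected)
```

Under the heterogeneity model, where a covariate may matter in some datasets but not others, the row-wise shrinkage above would have to be replaced by a penalty that can also zero out individual entries within a row, which is the role the abstract assigns to composite and sparse group penalties.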
