A Support Vector Machine Based Cure Rate Model For Interval Censored Data


Abstract in English

The mixture cure rate model is the most commonly used cure rate model in the literature. In the mixture cure rate model, the standard approach to modeling the effect of covariates on the cured or uncured probability is to use a logistic function. This implies that the boundary separating the cured and uncured subjects is linear in the covariates. In this paper, we propose a new mixture cure rate model for interval-censored data that uses the support vector machine (SVM) to model the effect of covariates on the uncured or cured probability (i.e., on the incidence part of the model). The proposed model inherits the features of the SVM and provides the flexibility to capture classification boundaries that are non-linear and more complex. Furthermore, the new model can be used to model the effect of covariates on the incidence part when the dimension of the covariates is high. The latency part is modeled by a proportional hazards structure. We develop an estimation procedure based on the expectation-maximization (EM) algorithm to estimate the cured/uncured probability and the latency model parameters. Our simulation results show that the proposed model captures complex classification boundaries better than the existing logistic-regression-based mixture cure rate model. We also show that our model's ability to capture complex classification boundaries improves the estimation of the latency parameters. For illustration, we apply the proposed methodology to interval-censored data on smoking cessation.
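The abstract does not state the model formulas, so the following is only a minimal sketch of the standard mixture cure formulation it builds on: a logistic incidence part, a proportional hazards latency part, and the likelihood contribution of an interval-censored observation. All function names are hypothetical, the exponential baseline survival is an illustrative assumption, and the sketch covers neither the SVM-based incidence model nor the EM procedure proposed in the paper.

```python
import math


def logistic_uncured_prob(z, b0, b):
    """Logistic incidence: P(uncured | z) = 1 / (1 + exp(-(b0 + b'z))).

    The probability equals 0.5 exactly where b0 + b'z = 0, so the
    cured/uncured boundary is a hyperplane (linear) in z.
    """
    eta = b0 + sum(bj * zj for bj, zj in zip(b, z))
    return 1.0 / (1.0 + math.exp(-eta))


def uncured_survival(t, x, beta, rate=1.0):
    """Proportional hazards latency: S_u(t | x) = S_0(t) ** exp(beta'x).

    ASSUMPTION: the baseline S_0(t) = exp(-rate * t) is chosen only to
    keep the sketch self-contained; it is not taken from the paper.
    """
    baseline = math.exp(-rate * t)
    return baseline ** math.exp(sum(bj * xj for bj, xj in zip(beta, x)))


def population_survival(t, pi_uncured, x, beta, rate=1.0):
    """Mixture cure model: S_pop(t) = (1 - pi) + pi * S_u(t | x)."""
    return (1.0 - pi_uncured) + pi_uncured * uncured_survival(t, x, beta, rate)


def interval_likelihood(left, right, pi_uncured, x, beta, rate=1.0):
    """Likelihood contribution for an event known to lie in (left, right].

    right = math.inf encodes a right-censored subject, who may be cured
    or uncured, so the contribution is S_pop(left).
    """
    s_left = population_survival(left, pi_uncured, x, beta, rate)
    if math.isinf(right):
        return s_left
    return s_left - population_survival(right, pi_uncured, x, beta, rate)
```

In this formulation the population survival function levels off at the cured fraction 1 - pi as t grows, and replacing `logistic_uncured_prob` with a classifier that allows non-linear boundaries is the kind of change to the incidence part that the abstract describes.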
