Clinical Prediction Models to Predict the Risk of Multiple Binary Outcomes: a comparison of approaches


Abstract in English

Clinical prediction models (CPMs) are used to predict clinically relevant outcomes or events. Typically, prognostic CPMs are derived to predict the risk of a single future outcome. However, with growing emphasis on the prediction of multimorbidity, there is an increasing need for CPMs that simultaneously predict the risk of each of multiple future outcomes. A common approach to multi-outcome risk prediction is to derive a CPM for each outcome separately, then multiply the predicted risks to obtain the joint risk. This approach is valid only if the outcomes are conditionally independent given the covariates, and it fails to exploit potential relationships between the outcomes. This paper outlines several approaches that could be used to develop prognostic CPMs for multiple outcomes. We consider four methods that differ in complexity and in their conditional independence assumptions: probabilistic classifier chains, multinomial logistic regression, multivariate logistic regression, and a Bayesian probit model. These are compared with methods that rely on conditional independence: separate univariate CPMs and stacked regression. Using a simulation study and a real-world example from the MIMIC-III database, we illustrate that CPMs for joint risk prediction of multiple outcomes should only be derived using methods that model the residual correlation between outcomes. In such situations, our results suggest that probabilistic classifier chains, multinomial logistic regression, or the Bayesian probit model are all appropriate choices. We call into question the development of CPMs for each outcome in isolation when multiple correlated or structurally related outcomes are of interest, and we recommend more holistic risk prediction.
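The point about multiplying separately predicted risks can be illustrated with a small simulation. The following is a minimal sketch, not the simulation design used in the paper: it assumes two binary outcomes generated by thresholding a bivariate normal latent variable for a single covariate pattern, so that conditional independence given the covariates reduces to plain independence. The function name `simulate_joint_vs_product`, the thresholds, and the correlation values are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_joint_vs_product(rho, n=200_000, seed=0):
    """Compare the empirical joint risk of two binary outcomes with the
    product of their marginal risks (illustrative assumption: a latent
    bivariate normal with correlation `rho`, no covariates)."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)

    # Hypothetical thresholds defining the two binary outcomes.
    y1 = z[:, 0] > 0.5
    y2 = z[:, 1] > 1.0

    joint = np.mean(y1 & y2)              # empirical P(Y1 = 1, Y2 = 1)
    product = np.mean(y1) * np.mean(y2)   # P(Y1 = 1) * P(Y2 = 1)
    return joint, product


if __name__ == "__main__":
    for rho in (0.0, 0.6):
        joint, product = simulate_joint_vs_product(rho)
        print(f"rho={rho}: joint={joint:.4f}, product of marginals={product:.4f}")
```

Under these assumptions, the two quantities should agree closely when the latent correlation is zero and diverge when it is positive, which is the situation in which multiplying risks from separate CPMs would misstate the joint risk.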
