Modelling fertility potential in survivors of childhood cancer: An introduction to modern statistical and computational methods


Abstract in English

Statistical and computational methods are widely used in today's scientific studies. Using female fertility potential in childhood cancer survivors as an example, we illustrate how these methods can be used to extract insight into biological processes from noisy observational data in order to inform decision making. We start by contextualizing the computational methods with the working example: modelling the risk of acute ovarian failure in female childhood cancer survivors, with the aim of quantifying the risk of permanent ovarian failure due to exposure to lifesaving but nonetheless toxic cancer treatments. This is followed by a description of the general framework of classification problems. We provide an overview of the modelling algorithms employed in our example, including one classic model (logistic regression) and two popular modern learning methods (random forest and support vector machines). Using the working example, we show the general steps of data preparation for modelling, the variable selection steps for the classic model, and how model performance might be improved using visualization tools. We end with a note on the importance of model evaluation.
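To make the classification framework concrete, the following is a minimal sketch of the classic model named above (logistic regression), fitted by plain gradient descent and evaluated on held-out data. It is an illustration only, not the analysis used in the paper: the data are simulated, and the feature names (`dose`, `age`), the coefficients in `simulate`, and all function names are hypothetical assumptions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def simulate(n, rng):
    # Simulated (made-up) data: two features scaled to [0, 1] and a
    # binary outcome whose risk rises with the hypothetical "dose".
    rows = []
    for _ in range(n):
        dose = rng.uniform(0.0, 1.0)
        age = rng.uniform(0.0, 1.0)
        risk = sigmoid(-3.0 + 5.0 * dose + 1.0 * age)
        outcome = 1 if rng.random() < risk else 0
        rows.append(([dose, age], outcome))
    return rows


def predict_proba(w, b, x):
    # Predicted probability of the outcome for one feature vector.
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(data, lr=0.5, epochs=500):
    # Batch gradient descent on the mean logistic (log) loss.
    n_features = len(data[0][0])
    w = [0.0] * n_features
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in data:
            err = predict_proba(w, b, x) - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, data, threshold=0.5):
    # Fraction of cases whose thresholded prediction matches the label.
    hits = 0
    for x, y in data:
        predicted = 1 if predict_proba(w, b, x) >= threshold else 0
        hits += int(predicted == y)
    return hits / len(data)


rng = random.Random(0)
train = simulate(400, rng)   # data used to fit the model
test = simulate(200, rng)    # held-out data used only for evaluation
w, b = fit_logistic(train)
acc = accuracy(w, b, test)
print("coefficients:", w, "intercept:", b)
print("held-out accuracy:", acc)
```

Evaluating on data that were not used for fitting mirrors the emphasis on model evaluation: accuracy measured on the training set alone would tend to overstate how well the model generalizes.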
