No Arabic abstract
Machine learning for healthcare often trains models on de-identified datasets with randomly-shifted calendar dates, ignoring the fact that data were generated under hospital operation practices that change over time. These changing practices induce definitive changes in observed data which confound evaluations which do not account for dates and limit the generalisability of date-agnostic models. In this work, we establish the magnitude of this problem on MIMIC, a public hospital dataset, and showcase a simple solution. We augment MIMIC with the year in which care was provided and show that a model trained using standard feature representations will significantly degrade in quality over time. We find a deterioration of 0.3 AUC when evaluating mortality prediction on data from 10 years later. We find a similar deterioration of 0.15 AUC for length-of-stay. In contrast, we demonstrate that clinically-oriented aggregates of raw features significantly mitigate future deterioration. Our suggested aggregated representations, when retrained yearly, have prediction quality comparable to year-agnostic models.
Despite the intense attention and investment into clinical machine learning (CML) research, relatively few applications convert to clinical practice. While research is important in advancing the state-of-the-art, translation is equally important in bringing these technologies into a position to ultimately impact patient care and live up to extensive expectations surrounding AI in healthcare. To better characterize a holistic perspective among researchers and practitioners, we survey several participants with experience in developing CML for clinical deployment about their learned experiences. We collate these insights and identify several main categories of barriers and pitfalls in order to better design and develop clinical machine learning applications.
Prediction of diabetes and its various complications has been studied in a number of settings, but a comprehensive overview of problem setting for diabetes prediction and care management has not been addressed in the literature. In this document we seek to remedy this omission in literature with an encompassing overview of diabetes complication prediction as well as situating this problem in the context of real world healthcare management. We illustrate various problems encountered in real world clinical scenarios via our own experience with building and deploying such models. In this manuscript we illustrate a Machine Learning (ML) framework for addressing the problem of predicting Type 2 Diabetes Mellitus (T2DM) together with a solution for risk stratification, intervention and management. These ML models align with how physicians think about disease management and mitigation, which comprises these four steps: Identify, Stratify, Engage, Measure.
Sepsis is a life-threatening condition that seriously endangers millions of people over the world. Hopefully, with the widespread availability of electronic health records (EHR), predictive models that can effectively deal with clinical sequential data increase the possibility to predict sepsis and take early preventive treatment. However, the early prediction is challenging because patients sequential data in EHR contains temporal interactions of multiple clinical events. And capturing temporal interactions in the long event sequence is hard for traditional LSTM. Rather than directly applying the LSTM model to the event sequences, our proposed model firstly aggregates heterogeneous clinical events in a short period and then captures temporal interactions of the aggregated representations with LSTM. Our proposed Heterogeneous Event Aggregation can not only shorten the length of clinical event sequence but also help to retain temporal interactions of both categorical and numerical features of clinical events in the multiple heads of the aggregation representations. In the PhysioNet/Computing in Cardiology Challenge 2019, with the team named PKU_DLIB, our proposed model, in high efficiency, achieved utility score (0.321) in the full test set.
Although recent multi-task learning methods have shown to be effective in improving the generalization of deep neural networks, they should be used with caution for safety-critical applications, such as clinical risk prediction. This is because even if they achieve improved task-average performance, they may still yield degraded performance on individual tasks, which may be critical (e.g., prediction of mortality risk). Existing asymmetric multi-task learning methods tackle this negative transfer problem by performing knowledge transfer from tasks with low loss to tasks with high loss. However, using loss as a measure of reliability is risky since it could be a result of overfitting. In the case of time-series prediction tasks, knowledge learned for one task (e.g., predicting the sepsis onset) at a specific timestep may be useful for learning another task (e.g., prediction of mortality) at a later timestep, but lack of loss at each timestep makes it difficult to measure the reliability at each timestep. To capture such dynamically changing asymmetric relationships between tasks in time-series data, we propose a novel temporal asymmetric multi-task learning model that performs knowledge transfer from certain tasks/timesteps to relevant uncertain tasks, based on feature-level uncertainty. We validate our model on multiple clinical risk prediction tasks against various deep learning models for time-series prediction, which our model significantly outperforms, without any sign of negative transfer. Further qualitative analysis of learned knowledge graphs by clinicians shows that they are helpful in analyzing the predictions of the model. Our final code is available at https://github.com/anhtuan5696/TPAMTL.
Acute kidney injury (AKI) in critically ill patients is associated with significant morbidity and mortality. Development of novel methods to identify patients with AKI earlier will allow for testing of novel strategies to prevent or reduce the complications of AKI. We developed data-driven prediction models to estimate the risk of new AKI onset. We generated models from clinical notes within the first 24 hours following intensive care unit (ICU) admission extracted from Medical Information Mart for Intensive Care III (MIMIC-III). From the clinical notes, we generated clinically meaningful word and concept representations and embeddings, respectively. Five supervised learning classifiers and knowledge-guided deep learning architecture were used to construct prediction models. The best configuration yielded a competitive AUC of 0.779. Our work suggests that natural language processing of clinical notes can be applied to assist clinicians in identifying the risk of incident AKI onset in critically ill patients upon admission to the ICU.