Assessing Chronic Kidney Disease from Office Visit Records Using Hierarchical Meta-Classification of an Imbalanced Dataset


Abstract

Chronic Kidney Disease (CKD) is an increasingly prevalent condition affecting 13% of the US population. Because the disease is often silent, its diagnosis is challenging. Identifying CKD stages from standard office visit records can support early detection and lead to timely intervention. The dataset we use is highly imbalanced. We propose a hierarchical meta-classification method that stratifies CKD by severity level using simple quantitative, non-text features gathered from office visit records, while addressing the data imbalance. Our method effectively stratifies CKD severity levels, attaining high average sensitivity, precision, and F-measure (approximately 93%). We also conduct experiments in which the dimensionality of the data is significantly reduced to include only the most salient features. Our results show that the system's good performance is retained both with the reduced feature sets and with much smaller training sets, indicating that our method is stable and generalizable.
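
As an illustration of how the averaged measures reported above could be computed, the following is a minimal sketch, not the authors' evaluation code. It assumes macro-averaging (an unweighted mean over classes), which the abstract does not specify; the function names and the class labels "mild" and "severe" are hypothetical placeholders.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (sensitivity, precision, f_measure)} for each class."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    metrics = {}
    for c in labels:
        sens = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        prec = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        f = 2 * prec * sens / (prec + sens) if (prec + sens) else 0.0
        metrics[c] = (sens, prec, f)
    return metrics


def macro_average(metrics):
    """Unweighted mean over classes (assumed averaging scheme)."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


# Hypothetical labels for an imbalanced toy example, not data from the study.
y_true = ["mild", "mild", "mild", "mild", "severe", "severe"]
y_pred = ["mild", "mild", "mild", "severe", "severe", "severe"]
scores = per_class_metrics(y_true, y_pred)
avg_sens, avg_prec, avg_f = macro_average(scores)
```

On an imbalanced dataset, macro-averaging weights each severity level equally, so poor performance on a rare class is not hidden by a dominant one; a support-weighted average would behave differently.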
