Multi-layer Trajectory Clustering: A Network Algorithm for Disease Subtyping


Abstract in English

Many diseases display heterogeneity in clinical features and their progression, which indicates the existence of disease subtypes. Extracting patterns of disease variable progression for each subtype has broad applications in medicine, for example in early prognosis and personalized medical therapy. This work presents a novel, data-driven, network-based Trajectory Clustering (TC) algorithm for identifying Parkinson's disease subtypes based on disease trajectory. Modeling patient-variable interactions as a bipartite network, TC first extracts communities of co-expressing disease variables at different stages of progression. It then identifies Parkinson's disease subtypes by clustering similar patient trajectories, each characterized by the severity of disease variables, through a multi-layer network. Trajectory similarity accounts for direct overlaps between trajectories as well as second-order similarities, i.e., common overlap with a third set of trajectories. This work clusters trajectories across two types of layers: (a) temporal layers, and (b) ranges of an independent outcome variable (representative of disease severity), each of which yields four distinct subtypes. The former subtypes exhibit differences in the progression of disease domains (Cognitive, Mental Health, etc.), whereas the latter subtypes exhibit different degrees of progression, i.e., some remain mild, whereas others show significant deterioration after 5 years. The TC approach is validated through statistical analyses and through the consistency of the identified subtypes with the medical literature. This generalizable and robust method can easily be extended to other progressive multivariate disease datasets, and can effectively assist in targeted, subtype-specific treatment in the field of personalized medicine.
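The similarity notion described above, direct overlap plus second-order overlap through a third trajectory, can be illustrated with a minimal sketch. This is not the algorithm from this work: the abstract does not specify the similarity measure, so the Jaccard overlap, the weight `alpha`, the function name `trajectory_similarity`, and the toy data are all assumptions made for illustration. Each trajectory is represented as a set of hypothetical (layer, community) pairs.

```python
import numpy as np


def trajectory_similarity(trajectories, alpha=0.5):
    """Combine direct and second-order overlap between trajectories.

    trajectories: list of sets of (layer, community) pairs, one per patient.
    alpha: assumed weight of the second-order term (not from the source).
    """
    n = len(trajectories)
    direct = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            union = trajectories[i] | trajectories[j]
            if union:
                shared = trajectories[i] & trajectories[j]
                # Jaccard overlap is an assumed stand-in for "direct overlap".
                direct[i, j] = direct[j, i] = len(shared) / len(union)
    # Second-order term: for each pair (i, j), sum over third trajectories k
    # of direct[i, k] * direct[k, j]. The diagonal of `direct` is zero, so
    # k = i and k = j contribute nothing.
    second = direct @ direct
    np.fill_diagonal(second, 0.0)
    return direct + alpha * second


# Toy data: three patients, two layers, community labels are made up.
example = [
    {(0, "a"), (1, "b")},
    {(0, "a"), (1, "c")},
    {(0, "d"), (1, "c")},
]
sim = trajectory_similarity(example)
```

In this toy case patients 0 and 2 share no community directly, yet they receive a small nonzero similarity because both overlap with patient 1. The resulting matrix could be passed to any clustering method that accepts a precomputed similarity matrix.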
