Distributional data analysis via quantile functions and its application to modelling digital biomarkers of gait in Alzheimer's Disease

Abstract in English

With the advent of continuous health monitoring via wearable devices, users now generate their own streams of continuous data, such as minute-level physical activity or heart rate. Aggregating these streams into scalar summaries ignores the distributional nature of the data and often loses critical information. We propose to capture the distributional properties of wearable data via user-specific quantile functions, which are then used in functional regression and multi-modal distributional modelling. In addition, we propose to encode user-specific distributional information with user-specific L-moments, robust rank-based analogs of traditional moments. Importantly, this L-moment encoding yields mutually consistent functional and distributional interpretations of the results of scalar-on-function regression. We also demonstrate how L-moments can be flexibly employed to analyze joint and individual sources of variation in multi-modal distributional data. The proposed methods are illustrated in a study of the association of accelerometry-derived digital gait biomarkers with Alzheimer's disease (AD), conducted in people with AD and in people with normal cognitive function. Our analysis shows that the proposed quantile-based representation yields much higher predictive performance than simple distributional summaries and much stronger associations with clinical cognitive scales.
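
As a minimal illustrative sketch of the two representations named above (not the authors' implementation), the Python code below evaluates an empirical quantile function on a probability grid and computes the first four sample L-moments from probability-weighted moments of the order statistics. The function names, the grid size, and the synthetic "minute-level" data are assumptions made for illustration only. For reference, the r-th population L-moment is the integral of the quantile function against the shifted Legendre polynomial of degree r-1, which is what links a quantile-function representation to an L-moment encoding.

```python
import numpy as np


def empirical_quantile_function(x, grid):
    """Evaluate the empirical quantile function of sample x at probabilities in grid."""
    return np.quantile(np.asarray(x, dtype=float), grid)


def sample_l_moments(x):
    """Return the first four sample L-moments (l1, l2, l3, l4) of a 1-D sample.

    Uses the standard unbiased estimators based on probability-weighted
    moments b0..b3 of the sorted sample. Requires at least four observations.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if n < 4:
        raise ValueError("need at least 4 observations for four L-moments")
    i = np.arange(1, n + 1)  # ranks 1..n of the order statistics

    # Probability-weighted moments; weights vanish automatically for i <= r.
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum((i - 1) * (i - 2) * (i - 3)
                / ((n - 1) * (n - 2) * (n - 3)) * x) / n

    # L-moments as linear combinations of b0..b3
    # (coefficients of the shifted Legendre polynomials).
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return np.array([l1, l2, l3, l4])


# Hypothetical usage: one user's synthetic minute-level measurements.
rng = np.random.default_rng(0)
user_stream = rng.gamma(shape=2.0, scale=1.5, size=1440)  # synthetic, one day

grid = np.linspace(0.01, 0.99, 99)          # assumed probability grid
user_quantiles = empirical_quantile_function(user_stream, grid)
user_l_moments = sample_l_moments(user_stream)
```

Under these assumptions, each user is summarized either by the vector `user_quantiles` (a discretized quantile function that could serve as a functional predictor) or by the short vector `user_l_moments` (location, scale, and unscaled skewness- and kurtosis-type L-moments).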
