Causal Inference on Non-linear Spaces: Distribution Functions and Beyond


Abstract in English

Understanding causal relationships is one of the most important goals of modern science. So far, the causal inference literature has focused almost exclusively on outcomes coming from a linear space, most commonly the Euclidean space. However, it is increasingly common that complex datasets collected through electronic sources, such as wearable devices and medical imaging, cannot be represented as data points from linear spaces. In this paper, we present a formal definition of causal effects for outcomes from non-linear spaces, with a focus on the Wasserstein space of cumulative distribution functions. We develop doubly robust estimators and associated asymptotic theory for these causal effects. Our framework extends to outcomes from certain Riemannian manifolds. As an illustration, we use our framework to quantify the causal effect of marriage on physical activity patterns using wearable device data collected through the National Health and Nutrition Examination Survey.

Download