We consider the problem of predicting a response $Y$ from a set of covariates $X$ when test and training distributions differ. Since such differences may have causal explanations, we consider test distributions that emerge from interventions in a structural causal model, and focus on minimizing the worst-case risk. Causal regression models, which regress the response on its direct causes, remain unchanged under arbitrary interventions on the covariates, but they are not always optimal in the above sense. For linear models and bounded interventions, for example, alternative solutions have been shown to be minimax optimal for prediction. We introduce the formal framework of distribution generalization, which allows us to analyze the above problem in partially observed nonlinear models, both for direct interventions on $X$ and for interventions that occur indirectly via exogenous variables $A$. The framework takes into account that, in practice, minimax solutions must be identified from data. It allows us to characterize under which classes of interventions the causal function is minimax optimal. We prove sufficient conditions for distribution generalization and present corresponding impossibility results. We propose a practical method, NILE, that achieves distribution generalization in a nonlinear instrumental variable (IV) setting with linear extrapolation. We prove consistency and present empirical results.
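To make the worst-case objective concrete, the minimax problem described above can be sketched as follows; the intervention class $\mathcal{I}$, the hypothesis class $\mathcal{F}$, and the squared-error loss are notational assumptions rather than choices fixed by the abstract:
\[
f^{\star} \in \operatorname*{arg\,min}_{f \in \mathcal{F}} \; \sup_{i \in \mathcal{I}} \; \mathbb{E}_{P_{i}}\!\left[\left(Y - f(X)\right)^{2}\right],
\]
where $P_{i}$ denotes the distribution of $(X, Y)$ induced by intervention $i$, acting either directly on $X$ or indirectly through the exogenous variables $A$. The causal function remains invariant under arbitrary interventions on the covariates, but, as stated above, it need not attain this minimax value when the interventions in $\mathcal{I}$ are bounded.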
Generalization to out-of-distribution (OOD) data, or domain generalization, is one of the central problems in modern machine learning. Recently, there has been a surge of attempts to propose algorithms for OOD generalization, most of which build upon the idea of extracting …
In many applications, there is a need to predict the effect of an intervention on different individuals from data. For example, which customers can be persuaded by a product promotion? Which patients should receive a certain type of treatment?
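The individual-level question raised here is commonly formalized as a conditional average treatment effect; the potential-outcomes notation below is a standard convention, not notation taken from the text:
\[
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right],
\]
where $Y(1)$ and $Y(0)$ denote the potential outcomes of an individual under treatment and under control, respectively, and $X$ collects the individual's covariates. Customers or patients with large $\tau(x)$ are those for whom the intervention is predicted to pay off.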
Understanding causal relationships is one of the most important goals of modern science. So far, the causal inference literature has focused almost exclusively on outcomes taking values in a linear space, most commonly a Euclidean space. However, it is …
Scientists frequently generalize population-level causal quantities, such as the average treatment effect, from a source population to a target population. When the causal effects are heterogeneous, differences in subject characteristics between the source and target populations …
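In the standard potential-outcomes notation (an assumption; the truncated abstract above does not fix notation), the quantity being generalized is
\[
\tau_{\mathrm{T}} = \mathbb{E}_{\mathrm{T}}\left[ Y(1) - Y(0) \right],
\]
the average treatment effect in the target population. When effects are heterogeneous in covariates $X$, a common transport identity, valid under conditional exchangeability of the effect given $X$ and overlap between the two populations, is
\[
\tau_{\mathrm{T}} = \mathbb{E}_{\mathrm{T}}\!\left[ \mathbb{E}_{\mathrm{S}}\left[\, Y(1) - Y(0) \mid X \,\right] \right],
\]
so that the source-population conditional effect is reweighted by the target covariate distribution.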
Modern deep neural networks suffer from performance degradation when evaluated on test data drawn from a different distribution than the training data. Domain generalization aims to tackle this problem by learning transferable knowledge from multiple source domains …