No Arabic abstract
Instrumental variables (IVs), sources of treatment randomization that are conditionally independent of the outcome, play an important role in causal inference with unobserved confounders. However, the existing IV-based counterfactual prediction methods need well-predefined IVs, while its an art rather than science to find valid IVs in many real-world scenes. Moreover, the predefined hand-made IVs could be weak or erroneous by violating the conditions of valid IVs. These thorny facts hinder the application of the IV-based counterfactual prediction methods. In this paper, we propose a novel Automatic Instrumental Variable decomposition (AutoIV) algorithm to automatically generate representations serving the role of IVs from observed variables (IV candidates). Specifically, we let the learned IV representations satisfy the relevance condition with the treatment and exclusion condition with the outcome via mutual information maximization and minimization constraints, respectively. We also learn confounder representations by encouraging them to be relevant to both the treatment and the outcome. The IV and confounder representations compete for the information with their constraints in an adversarial game, which allows us to get valid IV representations for IV-based counterfactual prediction. Extensive experiments demonstrate that our method generates valid IV representations for accurate IV-based counterfactual prediction.
In this paper, we focus on the problem of stable prediction across unknown test data, where the test distribution is agnostic and might be totally different from the training one. In such a case, previous machine learning methods might exploit subtly spurious correlations in training data induced by non-causal variables for prediction. Those spurious correlations are changeable across data, leading to instability of prediction across data. By assuming the relationships between causal variables and response variable are invariant across data, to address this problem, we propose a conditional independence test based algorithm to separate those causal variables with a seed variable as priori, and adopt them for stable prediction. By assuming the independence between causal and non-causal variables, we show, both theoretically and with empirical experiments, that our algorithm can precisely separate causal and non-causal variables for stable prediction across test data. Extensive experiments on both synthetic and real-world datasets demonstrate that our algorithm outperforms state-of-the-art methods for stable prediction.
Extending the forecasting time is a critical demand for real applications, such as extreme weather early warning and long-term energy consumption planning. This paper studies the textit{long-term forecasting} problem of time series. Prior Transformer-based models adopt various self-attention mechanisms to discover the long-range dependencies. However, intricate temporal patterns of the long-term future prohibit the model from finding reliable dependencies. Also, Transformers have to adopt the spar
Learning to predict missing links is important for many graph-based applications. Existing methods were designed to learn the observed association between two sets of variables: (1) the observed graph structure and (2) the existence of link between a pair of nodes. However, the causal relationship between these variables was ignored and we visit the possibility of learning it by simply asking a counterfactual question: would the link exist or not if the observed graph structure became different? To answer this question by causal inference, we consider the information of the node pair as context, global graph structural properties as treatment, and link existence as outcome. In this work, we propose a novel link prediction method that enhances graph learning by the counterfactual inference. It creates counterfactual links from the observed ones, and our method learns representations from both of them. Experiments on a number of benchmark datasets show that our proposed method achieves the state-of-the-art performance on link prediction.
A popular way to estimate the causal effect of a variable x on y from observational data is to use an instrumental variable (IV): a third variable z that affects y only through x. The more strongly z is associated with x, the more reliable the estimate is, but such strong IVs are difficult to find. Instead, practitioners combine more commonly available IV candidates---which are not necessarily strong, or even valid, IVs---into a single summary that is plugged into causal effect estimators in place of an IV. In genetic epidemiology, such approaches are known as allele scores. Allele scores require strong assumptions---independence and validity of all IV candidates---for the resulting estimate to be reliable. To relax these assumptions, we propose Ivy, a new method to combine IV candidates that can handle correlated and invalid IV candidates in a robust manner. Theoretically, we characterize this robustness, its limits, and its impact on the resulting causal estimates. Empirically, Ivy can correctly identify the directionality of known relationships and is robust against false discovery (median effect size <= 0.025) on three real-world datasets with no causal effects, while allele scores return more biased estimates (median effect size >= 0.118).
Instrumental variable (IV) regression is a standard strategy for learning causal relationships between confounded treatment and outcome variables from observational data by utilizing an instrumental variable, which affects the outcome only through the treatment. In classical IV regression, learning proceeds in two stages: stage 1 performs linear regression from the instrument to the treatment; and stage 2 performs linear regression from the treatment to the outcome, conditioned on the instrument. We propose a novel method, deep feature instrumental variable regression (DFIV), to address the case where relations between instruments, treatments, and outcomes may be nonlinear. In this case, deep neural nets are trained to define informative nonlinear features on the instruments and treatments. We propose an alternating training regime for these features to ensure good end-to-end performance when composing stages 1 and 2, thus obtaining highly flexible feature maps in a computationally efficient manner. DFIV outperforms recent state-of-the-art methods on challenging IV benchmarks, including settings involving high dimensional image data. DFIV also exhibits competitive performance in off-policy policy evaluation for reinforcement learning, which can be understood as an IV regression task.