Orthogonal Structure Search for Efficient Causal Discovery from Observational Data


Published by: Stefan Bauer

Publication date: 2019

Research field: Mathematical Statistics, Informatics Engineering

Language: English

Authors: Anant Raj, Luigi Gresele, Michel Besserve

Machine Learning


The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. Recent work exploits stability of regression coefficients or invariance properties of models across different experimental conditions for reconstructing the full causal graph. These approaches generally do not scale well with the number of explanatory variables and are difficult to extend to nonlinear relationships. In contrast to existing work, we propose an approach that works even for observational data alone, while still offering theoretical guarantees, including for the case of partially nonlinear relationships. Our algorithm requires only one estimation for each variable, and in our experiments we apply our causal discovery algorithm even to large graphs, demonstrating significant improvements compared to well-established approaches.


Anant Raj, Stefan Bauer, Ashkan Soleymani (2020)

The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. Recent work in the field of causal discovery exploits invariance properties of models across different experimental conditions for detecting direct causal links. However, these approaches generally do not scale well with the number of explanatory variables, are difficult to extend to nonlinear relationships, and require data across different experiments. Inspired by debiased machine learning methods, we study a one-vs.-the-rest feature selection approach to discover the direct causal parents of the response. We propose an algorithm that works for purely observational data, while also offering theoretical guarantees, including for the case of partially nonlinear relationships. Requiring only one estimation for each variable, we can apply our approach even to large graphs, demonstrating significant improvements compared to established approaches.
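The one-vs.-the-rest idea can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a partially linear model, uses plain least squares as the nuisance learner (any regressor could be substituted), and applies two-fold cross-fitting. The names `ols_fit_predict` and `orthogonal_scores`, the toy data, and the cutoff of 3.0 are all hypothetical choices made for illustration.

```python
import numpy as np

def ols_fit_predict(A_train, b_train, A_test):
    """Fit least squares with an intercept on one fold, predict on another."""
    A1 = np.column_stack([np.ones(len(A_train)), A_train])
    coef, *_ = np.linalg.lstsq(A1, b_train, rcond=None)
    return np.column_stack([np.ones(len(A_test)), A_test]) @ coef

def orthogonal_scores(X, y, n_folds=2, seed=0):
    """One estimation per variable: partial out the remaining variables
    from both X_k and y, then regress residual on residual."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    theta, z = np.zeros(d), np.zeros(d)
    for k in range(d):
        rest = np.delete(X, k, axis=1)
        res_x, res_y = np.zeros(n), np.zeros(n)
        for test_idx in folds:
            train = np.ones(n, dtype=bool)
            train[test_idx] = False
            res_x[test_idx] = X[test_idx, k] - ols_fit_predict(
                rest[train], X[train, k], rest[test_idx])
            res_y[test_idx] = y[test_idx] - ols_fit_predict(
                rest[train], y[train], rest[test_idx])
        theta[k] = (res_x @ res_y) / (res_x @ res_x)
        eps = res_y - theta[k] * res_x
        # Heteroskedasticity-robust standard error of the residual regression.
        se = np.sqrt(np.sum(res_x**2 * eps**2)) / (res_x @ res_x)
        z[k] = theta[k] / se
    return theta, z

# Toy data (assumed): x1 and x3 are parents of y; x2 is only correlated with x1.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
x3 = rng.normal(size=n)
y = 2.0 * x1 - 1.5 * x3 + rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

theta, z = orthogonal_scores(X, y)
selected = [k for k in range(X.shape[1]) if abs(z[k]) > 3.0]  # illustrative cutoff
print(theta, z, selected)
```

Each variable needs only one residual-on-residual regression, which is why the cost grows linearly with the number of candidate variables in this sketch.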

Machine Learning, Statistics Theory

Data Generating Process to Evaluate Causal Discovery Techniques for Time Series Data

Andrew R. Lawrence, Marcus Kaiser, Rui Sampaio (2021)

Going beyond correlations, the understanding and identification of causal relationships in observational time series, an important subfield of Causal Discovery, poses a major challenge. The lack of access to a well-defined ground truth for real-world data creates the need to rely on synthetic data for the evaluation of these methods. Existing benchmarks are limited in their scope, as they either are restricted to a static selection of data sets or do not allow for a granular assessment of the methods' performance when commonly made assumptions are violated. We propose a flexible and simple-to-use framework for generating time series data, which is aimed at developing, evaluating, and benchmarking time series causal discovery methods. In particular, the framework can be used to fine-tune novel methods on vast amounts of data, without overfitting them to a benchmark, but rather so they perform well in real-world use cases. Using our framework, we evaluate prominent time series causal discovery methods and demonstrate both a notable degradation in performance when their assumptions are invalidated and their sensitivity to the choice of hyperparameters. Finally, we propose future research directions and discuss how our framework can support both researchers and practitioners.
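As a rough illustration of synthetic time series with a known ground truth, the sketch below simulates a linear first-order vector autoregression and returns the lagged causal graph alongside the data. It is not the framework described in the abstract. The function name `simulate_var1`, the coefficient matrix, and the Gaussian noise are assumptions made for this example.

```python
import numpy as np

def simulate_var1(lag_matrix, n_steps, noise_std=1.0, burn_in=100, seed=0):
    """Simulate x_t = A @ x_{t-1} + noise and return (samples, true_graph).

    lag_matrix[i, j] is the effect of variable j at time t-1 on variable i
    at time t, so true_graph[i, j] is True when j is a lagged parent of i.
    """
    A = np.asarray(lag_matrix, dtype=float)
    d = A.shape[0]
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    samples = np.empty((n_steps, d))
    for t in range(burn_in + n_steps):
        x = A @ x + noise_std * rng.normal(size=d)
        if t >= burn_in:  # discard the transient before recording
            samples[t - burn_in] = x
    return samples, A != 0

# A stable chain 0 -> 1 -> 2 with self-dependence (all eigenvalues are 0.5).
A = [[0.5, 0.0, 0.0],
     [0.4, 0.5, 0.0],
     [0.0, 0.3, 0.5]]
data, graph = simulate_var1(A, n_steps=500)
print(data.shape)
print(graph.astype(int))
```

A discovery method can then be scored against `graph`. Violating an assumption, for example by swapping in non-Gaussian noise, would require changing only the noise line.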

Machine Learning

Higher-Order Orthogonal Causal Learning for Treatment Effect

Yiyan Huang, Cheuk Hang Leung, Xing Yan (2021)

Most existing studies on the double/debiased machine learning method concentrate on estimating the causal parameter from the first-order orthogonal score function. In this paper, we construct the $k^{\mathrm{th}}$-order orthogonal score function for estimating the average treatment effect (ATE) and present an algorithm that enables us to obtain the debiased estimator recovered from the score function. Such a higher-order orthogonal estimator is more robust to misspecification of the propensity score than the first-order one. In addition, it is applicable with many machine learning methodologies such as Lasso, Random Forests, Neural Nets, etc. We also conduct comprehensive experiments to test the power of the estimator we construct from the score function, using both simulated and real datasets.
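For orientation, the first-order orthogonal score that this line of work starts from is the doubly robust (AIPW) score for the ATE. The sketch below implements only that first-order baseline, not the higher-order construction of the paper. It assumes a single discrete covariate so that the nuisance functions can be estimated by cell means. The name `aipw_ate` and the simulated data are hypothetical.

```python
import numpy as np

def aipw_ate(x, d, y, n_folds=2, seed=0):
    """Cross-fitted first-order orthogonal (AIPW) estimate of the ATE.

    x: discrete covariate, d: binary treatment (0/1), y: outcome.
    Nuisances (propensity m, outcome means g1 and g0) are cell means
    computed on the training folds only.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    psi = np.zeros(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        for level in np.unique(x):
            tr = train & (x == level)
            m = d[tr].mean()                 # propensity score in this cell
            g1 = y[tr & (d == 1)].mean()     # treated outcome mean
            g0 = y[tr & (d == 0)].mean()     # control outcome mean
            te = test_idx[x[test_idx] == level]
            psi[te] = (g1 - g0
                       + d[te] * (y[te] - g1) / m
                       - (1 - d[te]) * (y[te] - g0) / (1 - m))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n)

# Simulated data (assumed): true ATE is 1.0, and x confounds d and y.
rng = np.random.default_rng(1)
n = 20000
x = rng.integers(0, 2, size=n)
d = (rng.random(n) < np.where(x == 1, 0.7, 0.3)).astype(int)
y = 1.0 * d + 2.0 * x + rng.normal(size=n)

ate, se = aipw_ate(x, d, y)
naive = y[d == 1].mean() - y[d == 0].mean()
print(ate, se, naive)
```

In this setup the naive difference in means is biased upward by the confounder, while the orthogonal score stays close to the true effect.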

Machine Learning, Econometrics

Unsuitability of NOTEARS for Causal Graph Discovery

Marcus Kaiser, Maksim Sipos (2021)

Causal Discovery methods aim to identify a DAG structure that represents causal relationships from observational data. In this article, we stress that it is important to test such methods for robustness in practical settings. As our main example, we analyze the NOTEARS method, for which we demonstrate a lack of scale-invariance. We show that NOTEARS is a method that aims to identify a parsimonious DAG from the data that explains the residual variance. We conclude that NOTEARS is not suitable for identifying truly causal relationships from the data.
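The lack of scale-invariance can be seen in a two-variable toy case. The sketch below does not run NOTEARS. It only evaluates an unpenalized least-squares loss of the kind such methods minimize, for each of the two possible edge directions, before and after rescaling one variable. The helper names `ls_loss` and `preferred_direction` and the data are assumptions made for this example.

```python
import numpy as np

def ls_loss(data, parent, child):
    """Least-squares loss of the two-node DAG parent -> child.

    The parent has no incoming edge, so it contributes its full variance;
    the child contributes its residual variance after regressing on the parent.
    """
    p = data[:, parent] - data[:, parent].mean()
    c = data[:, child] - data[:, child].mean()
    w = (p @ c) / (p @ p)  # OLS edge weight
    return 0.5 * (np.mean(p**2) + np.mean((c - w * p) ** 2))

def preferred_direction(data):
    """Return the edge direction with the smaller least-squares loss."""
    forward = ls_loss(data, 0, 1)
    backward = ls_loss(data, 1, 0)
    return "0 -> 1" if forward < backward else "1 -> 0"

# True model (assumed): variable 0 causes variable 1, unit noise variances.
rng = np.random.default_rng(0)
n = 5000
x0 = rng.normal(size=n)
x1 = x0 + rng.normal(size=n)
data = np.column_stack([x0, x1])

rescaled = data.copy()
rescaled[:, 1] *= 0.1  # e.g. a change of measurement units

print(preferred_direction(data))
print(preferred_direction(rescaled))
```

Changing the units of one variable flips the direction with the lower loss here, even though the underlying causal structure is unchanged.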

Machine Learning, Statistics Theory

Learning Individual Causal Effects from Networked Observational Data

Ruocheng Guo, Jundong Li, Huan Liu (2019)

Convenient access to observational data enables us to learn causal effects without randomized experiments. This research direction draws increasing attention in research areas such as economics, healthcare, and education. For example, we can study how a medicine (the treatment) causally affects the health condition (the outcome) of a patient using existing electronic health records. To validate causal effects learned from observational data, we have to control confounding bias -- the influence of variables which causally influence both the treatment and the outcome. Existing work along this line overwhelmingly relies on the unconfoundedness assumption that there do not exist unobserved confounders. However, this assumption is untestable and can even be untenable. Indeed, an important fact ignored by the majority of previous work is that observational data can come with network information that can be utilized to infer hidden confounders. For example, in an observational study of the individual-level treatment effect of a medicine, instead of randomized experiments, the medicine is often assigned to each individual based on a series of factors. Some of the factors (e.g., socioeconomic status) can be challenging to measure and therefore become hidden confounders. Fortunately, the socioeconomic status of an individual can be reflected by whom she is connected to in social networks. With this fact in mind, we aim to exploit the network information to recognize patterns of hidden confounders, which would further allow us to learn valid individual causal effects from observational data. In this work, we propose a novel causal inference framework, the network deconfounder, which learns representations to unravel patterns of hidden confounders from the network information. Empirically, we perform extensive experiments to validate the effectiveness of the network deconfounder on various datasets.

Social and Information Networks, Machine Learning