
The DeCAMFounder: Non-Linear Causal Discovery in the Presence of Hidden Variables

Published by Raj Agrawal
Publication date: 2021
Research field: Mathematical Statistics
Language: English





Many real-world decision-making tasks require learning causal relationships between a set of variables. Typical causal discovery methods, however, require that all variables are observed, which might not be realistic in practice. Unfortunately, in the presence of latent confounding, recovering causal relationships from observational data without making additional assumptions is an ill-posed problem. Fortunately, in practice, additional structure among the confounders can be expected. One example is pervasive confounding, which has been exploited for consistent causal estimation in the special case of linear causal models. In this paper, we provide a proof and method to estimate causal relationships in the non-linear, pervasive confounding setting. The heart of our procedure relies on the ability to estimate the pervasive confounding variation through a simple spectral decomposition of the observed data matrix. We derive a DAG score function based on this insight, and empirically compare our method to existing procedures. We show improved performance on both simulated and real datasets by explicitly accounting for both confounders and non-linear effects.
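To make the spectral step concrete, here is a minimal sketch of splitting a data matrix into a low-rank part and a residual with a truncated SVD. It is an illustration under simplifying assumptions, not the paper's procedure: the simulated data are linear, the number of components `k` is assumed known, and the function name `split_pervasive_component` is hypothetical. The DAG score function the abstract mentions is not reproduced here.

```python
import numpy as np

def split_pervasive_component(X, k):
    """Split column-centered data into a rank-k part and a residual.

    Assumption for this sketch: the top-k singular directions capture
    the variation driven by pervasive (dense) hidden confounders.
    """
    Xc = X - X.mean(axis=0)                      # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    low_rank = (U[:, :k] * s[:k]) @ Vt[:k]       # best rank-k approximation
    return low_rank, Xc - low_rank

# Toy data: k hidden confounders that each load on every observed variable.
rng = np.random.default_rng(0)
n, p, k = 500, 30, 2
H = rng.normal(size=(n, k))                      # hidden confounders (unobserved)
loadings = rng.normal(size=(k, p))               # dense loadings = pervasive
noise = 0.1 * rng.normal(size=(n, p))
X = H @ loadings + noise

confounding_hat, residual = split_pervasive_component(X, k)
```

In this toy setup `confounding_hat` approximates the centered confounder-driven variation and `residual` holds what is left over. In practice `k` would have to be chosen from the data, for example by inspecting the singular value spectrum.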


Read also

Measurement error in the observed values of the variables can greatly change the output of various causal discovery methods. This problem has received much attention in multiple fields, but it is not clear to what extent the causal model for the measurement-error-free variables can be identified in the presence of measurement error with unknown variance. In this paper, we study precise sufficient identifiability conditions for the measurement-error-free causal model and show what information of the causal model can be recovered from observed data. In particular, we present two different sets of identifiability conditions, based on the second-order statistics and higher-order statistics of the data, respectively. The former was inspired by the relationship between the generating model of the measurement-error-contaminated data and the factor analysis model, and the latter makes use of the identifiability result of the over-complete independent component analysis problem.
Debo Cheng, 2020
Causal effect estimation from observational data is a crucial but challenging task. Currently, only a limited number of data-driven causal effect estimation methods are available. These methods either provide only a bound on the causal effect of a treatment on the outcome, or produce a unique estimate of the causal effect but rely on strong assumptions about the data and have low efficiency. In this paper, we identify a practical problem setting and propose an approach to achieving unique and unbiased estimation of causal effects from data with hidden variables. For the approach, we have developed theorems to support the discovery of the proper covariate sets for confounding adjustment (adjustment sets). Based on the theorems, two algorithms are proposed for finding the proper adjustment sets from data with hidden variables to obtain unbiased and unique causal effect estimation. Experiments with synthetic datasets generated using five benchmark Bayesian networks and four real-world datasets have demonstrated the efficiency and effectiveness of the proposed algorithms, indicating the practicability of the identified problem setting and the potential of the proposed approach in real-world applications.
In time-to-event settings, the presence of competing events complicates the definition of causal effects. Here we propose new separable effects to study the causal effect of a treatment on an event of interest. The separable direct effect is the treatment effect on the event of interest not mediated by its effect on the competing event. The separable indirect effect is the treatment effect on the event of interest only through its effect on the competing event. Similar to Robins and Richardson's extended graphical approach for mediation analysis, the separable effects can only be identified under the assumption that the treatment can be decomposed into two distinct components that exert their effects through distinct causal pathways. Unlike existing definitions of causal effects in the presence of competing events, our estimands do not require cross-world contrasts or hypothetical interventions to prevent death. As an illustration, we apply our approach to a randomized clinical trial on estrogen therapy in individuals with prostate cancer.
Debo Cheng, 2020
This paper discusses the problem of causal query in observational data with hidden variables. The aim is to determine the change in an outcome when a variable is manipulated, given a set of plausible confounding variables that affect both the manipulated variable and the outcome. Such an experiment on data to estimate the causal effect of the manipulated variable is useful for validating an experiment design using historical data or for exploring confounders when studying a new relationship. However, existing data-driven methods for causal effect estimation face some major challenges, including poor scalability with high-dimensional data, low estimation accuracy due to heuristics used by the global causal structure learning algorithms, and the assumption of causal sufficiency when hidden variables are inevitable in data. In this paper, we develop a theorem for using local search to find a superset of the adjustment (or confounding) variables for causal effect estimation from observational data under a realistic pretreatment assumption. The theorem ensures that the unbiased estimate of the causal effect is included in the set of causal effects estimated by the superset of adjustment variables. Based on the developed theorem, we propose a data-driven algorithm for causal query. Experiments show that the proposed algorithm is faster and produces better causal effect estimation than an existing data-driven causal effect estimation method with hidden variables. The causal effects estimated by the proposed algorithm are as accurate as those by the state-of-the-art methods using domain knowledge.
Understanding causal relationships is one of the most important goals of modern science. So far, the causal inference literature has focused almost exclusively on outcomes coming from a linear space, most commonly the Euclidean space. However, it is increasingly common that complex datasets collected through electronic sources, such as wearable devices and medical imaging, cannot be represented as data points from linear spaces. In this paper, we present a formal definition of causal effects for outcomes from non-linear spaces, with a focus on the Wasserstein space of cumulative distribution functions. We develop doubly robust estimators and associated asymptotic theory for these causal effects. Our framework extends to outcomes from certain Riemannian manifolds. As an illustration, we use our framework to quantify the causal effect of marriage on physical activity patterns using wearable device data collected through the National Health and Nutrition Examination Survey.