Scientists have long been interested in estimating causal peer effects to understand how people's behaviors are affected by their network peers. However, it is well known that identification and estimation of causal peer effects are challenging in observational studies for two reasons. The first is the identification challenge due to unmeasured network confounding, for example, homophily bias and contextual confounding. The second is network dependence of observations, which one must take into account for valid statistical inference. Negative control variables, also known as placebo variables, have been widely used in observational studies, including peer effect analysis over networks, although they have been used primarily for bias detection. In this article, we establish a formal framework that leverages a pair of negative control outcome and exposure variables (double negative controls) to nonparametrically identify causal peer effects in the presence of unmeasured network confounding. We then propose a generalized method of moments estimator for causal peer effects, and establish its consistency and asymptotic normality under an assumption of $\psi$-network dependence. Finally, we provide a network heteroskedasticity and autocorrelation consistent variance estimator. Our methods are illustrated with an application to peer effects in education.
Although the exposure can be randomly assigned in studies of mediation effects, any form of direct intervention on the mediator is often infeasible. As a result, unmeasured mediator-outcome confounding can seldom be ruled out. We propose semiparametric identification of natural direct and indirect effects in the presence of unmeasured mediator-outcome confounding by leveraging heteroskedasticity restrictions on the observed data law. For inference, we develop semiparametric estimators that remain consistent under partial misspecifications of the observed data model. We illustrate the proposed estimators through both simulations and an application to evaluate the effect of self-efficacy on fatigue among health care workers during the COVID-19 outbreak.
Bayesian causal inference offers a principled approach to policy evaluation of proposed interventions on mediators or time-varying exposures. We outline a general approach to the estimation of causal quantities for settings with time-varying confounding, such as exposure-induced mediator-outcome confounders. We further extend this approach to propose two Bayesian data fusion (BDF) methods for unmeasured confounding. Using informative priors on quantities relating to the confounding bias parameters, our methods incorporate data from an external source where the confounder is measured in order to make inferences about causal estimands in the main study population. We present results from a simulation study comparing our data fusion methods to two common frequentist correction methods for unmeasured confounding bias in the mediation setting. We also demonstrate our method with an investigation of the role of stage at cancer diagnosis in contributing to Black-White colorectal cancer survival disparities.
The data drawn from biological, economic, and social systems are often confounded due to the presence of unmeasured variables. Prior work in causal discovery has focused on discrete search procedures for selecting acyclic directed mixed graphs (ADMGs), specifically ancestral ADMGs, that encode ordinary conditional independence constraints among the observed variables of the system. However, confounded systems also exhibit more general equality restrictions that cannot be represented via these graphs, placing a limit on the kinds of structures that can be learned using ancestral ADMGs. In this work, we derive differentiable algebraic constraints that fully characterize the space of ancestral ADMGs, as well as more general classes of ADMGs, arid ADMGs and bow-free ADMGs, that capture all equality restrictions on the observed variables. We use these constraints to cast causal discovery as a continuous optimization problem and design differentiable procedures to find the best fitting ADMG when the data comes from a confounded linear system of equations with correlated errors. We demonstrate the efficacy of our method through simulations and application to a protein expression dataset. Code implementing our methods is open-source and publicly available at https://gitlab.com/rbhatta8/dcd and will be incorporated into the Ananke package.
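To illustrate what a differentiable constraint on graph structure can look like, the following is a minimal sketch of a well-known smooth acyclicity penalty for purely directed graphs from the continuous structure learning literature. It is not the ADMG constraints derived in the abstract above, which also handle bidirected edges; the function name `acyclicity_penalty` and the polynomial form used here are illustrative assumptions.

```python
import numpy as np

def acyclicity_penalty(W):
    """Smooth penalty h(W) = tr[(I + (W*W)/d)^d] - d (illustrative, directed edges only).

    W[i, j] is the weight of the directed edge i -> j. The penalty is zero
    when the weighted directed graph has no cycles and positive otherwise,
    so it can be used as an equality constraint in continuous optimization.
    """
    W = np.asarray(W, dtype=float)
    d = W.shape[0]
    # Squaring elementwise makes all entries nonnegative, so cycle
    # contributions cannot cancel each other out.
    M = np.eye(d) + (W * W) / d
    return np.trace(np.linalg.matrix_power(M, d)) - d

# A chain 0 -> 1 -> 2 (acyclic) and the same chain with 2 -> 0 added (cyclic).
dag = np.array([[0.0, 1.5, 0.0],
                [0.0, 0.0, -2.0],
                [0.0, 0.0, 0.0]])
cyclic = dag.copy()
cyclic[2, 0] = 1.0
```

Evaluating `acyclicity_penalty(dag)` gives zero, while `acyclicity_penalty(cyclic)` is strictly positive, which is what lets a gradient-based solver push candidate graphs toward the acyclic set.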
Data-driven individualized decision making has recently received increasing research interest. Most existing methods rely on the assumption of no unmeasured confounding, which unfortunately cannot be ensured in practice, especially in observational studies. Motivated by the recently proposed proximal causal inference framework, we develop several proximal learning approaches to estimating optimal individualized treatment regimes (ITRs) in the presence of unmeasured confounding. In particular, we establish several identification results for different classes of ITRs, exhibiting the trade-off between the risk of making untestable assumptions and the value function improvement in decision making. Based on these results, we propose several classification-based approaches to finding a variety of restricted in-class optimal ITRs and develop their theoretical properties. The appealing numerical performance of our proposed methods is demonstrated via an extensive simulation study and one real data application.
We develop a new approach for identifying and estimating average causal effects in panel data under a linear factor model with unmeasured confounders. Compared to other methods for factor models, such as synthetic controls and matrix completion, our method does not require the number of time periods to grow to infinity. Instead, we draw inspiration from the two-way fixed effects model as a special case of the linear factor model, where a simple difference-in-differences transformation identifies the effect. We show that analogous, albeit more complex, transformations exist in the more general linear factor model, providing a new means to identify the effect in that model. In fact, many such transformations, called bridge functions, exist, all identifying the same causal effect estimand. This poses a unique challenge for estimation and inference, which we solve by targeting the minimal bridge function using a regularized estimation approach. We prove that our resulting average causal effect estimator is root-N consistent and asymptotically normal, and we provide asymptotically valid confidence intervals. Finally, we provide extensions for the case of a linear factor model with time-varying unmeasured confounders.
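The two-way fixed effects special case mentioned above can be made concrete with a small simulation. The sketch below assumes a noise-free model Y_it = alpha_i + gamma_t + tau * D_it with two periods, where treatment assignment depends on the unmeasured unit effect alpha_i. The helper names `simulate_twfe` and `did_estimate` and all parameter values are hypothetical, and the sketch covers only the simple difference-in-differences transformation, not the bridge function estimator proposed in the abstract.

```python
import numpy as np

def simulate_twfe(n_units=200, tau=2.0, seed=0):
    """Simulate a two-period panel from Y_it = alpha_i + gamma_t + tau * D_it."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n_units)        # unmeasured unit effects
    treated_unit = alpha > 0                # assignment depends on alpha
    gamma = np.array([0.0, 1.5])            # time effects for periods 0 and 1
    unit = np.repeat(np.arange(n_units), 2)
    period = np.tile(np.array([0, 1]), n_units)
    treated = treated_unit[unit]
    post = period == 1
    y = alpha[unit] + gamma[period] + tau * (treated & post)
    return y, treated, post

def did_estimate(y, treated, post):
    """(treated post - treated pre) - (control post - control pre)."""
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    post = np.asarray(post, dtype=bool)
    change_treated = y[treated & post].mean() - y[treated & ~post].mean()
    change_control = y[~treated & post].mean() - y[~treated & ~post].mean()
    return change_treated - change_control

y, treated, post = simulate_twfe()
did = did_estimate(y, treated, post)
# Naive post-period comparison, which is confounded by alpha.
naive = y[treated & post].mean() - y[~treated & post].mean()
```

In this design the within-group differencing removes alpha_i and the between-group differencing removes gamma_t, so `did` recovers tau up to floating-point error, while `naive` overstates it because treated units have larger unit effects.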