In the field of disparities research, there has been growing interest in developing a counterfactual-based decomposition analysis to identify underlying mediating mechanisms that help reduce disparities in populations. Despite rapid development in the area, most prior studies have been limited to regression-based methods, undermining the possibility of addressing complex models with multiple mediators and/or heterogeneous effects. We propose an estimation method that effectively addresses complex models. Moreover, we develop a novel sensitivity analysis for possible violations of identification assumptions. The proposed method and sensitivity analysis are demonstrated with data from the Midlife Development in the US study to investigate the degree to which disparities in cardiovascular health at the intersection of race and gender would be reduced if the distributions of education and perceived discrimination were the same across intersectional groups.
While a randomized controlled trial (RCT) readily measures the average treatment effect (ATE), this measure may need to be generalized to the target population to account for sampling bias in the RCT's population. Identifying this target population treatment effect requires covariates, available in both data sets (the RCT and the target population sample), that capture all treatment effect modifiers that are shifted between the two sets. Standard estimators then use either weighting (IPSW), outcome modeling (G-formula), or combine the two in doubly robust approaches (AIPSW). However, such covariates are often not available in both sets. Therefore, after completing existing proofs on the complete-case consistency of those three estimators, we compute the expected bias induced by a missing covariate, assuming a Gaussian distribution and a semi-parametric linear model. This enables sensitivity analysis for each missing covariate pattern, giving the sign of the expected bias. We also show that there is no gain in imputing a partially unobserved covariate. Finally, we study the replacement of a missing covariate by a proxy. We illustrate all these results on simulations, as well as on semi-synthetic benchmarks using data from the Tennessee Student/Teacher Achievement Ratio (STAR), and with a real-world example from critical care medicine.
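The generalization step described in this abstract can be illustrated with a toy case. The sketch below assumes a single discrete effect modifier, in which case reweighting the stratum-specific trial effects by the target population's covariate distribution is the simplest instance of the weighting and outcome-modeling ideas. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def stratum_effects(rct_rows):
    """Treated-minus-control mean outcome within each covariate stratum.

    rct_rows: list of (x, treatment, outcome) tuples from the trial.
    Assumes every stratum contains both treated and control units.
    """
    sums = defaultdict(lambda: [0.0, 0, 0.0, 0])  # [sum_y1, n1, sum_y0, n0]
    for x, a, y in rct_rows:
        s = sums[x]
        if a == 1:
            s[0] += y
            s[1] += 1
        else:
            s[2] += y
            s[3] += 1
    return {x: s[0] / s[1] - s[2] / s[3] for x, s in sums.items()}

def rct_ate(rct_rows):
    """Plain difference in means inside the trial."""
    treated = [y for _, a, y in rct_rows if a == 1]
    control = [y for _, a, y in rct_rows if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

def transported_ate(rct_rows, target_x):
    """Average the stratum effects over the target sample's covariate values."""
    effects = stratum_effects(rct_rows)
    return sum(effects[x] for x in target_x) / len(target_x)

# Hypothetical toy data: the effect is 1 when x = 0 and 3 when x = 1.
rct_rows = (
    [(0, 1, 1.0)] * 4 + [(0, 0, 0.0)] * 4   # x = 0 is over-represented in the trial
    + [(1, 1, 3.0)] + [(1, 0, 0.0)]
)
target_x = [0, 0, 1, 1, 1]                   # x = 1 is more common in the target

print(rct_ate(rct_rows))                     # effect as measured in the trial
print(transported_ate(rct_rows, target_x))   # effect reweighted to the target
```

If the modifier `x` were unobserved in either data set, the reweighting could not be carried out, which is the situation the sensitivity analysis in the abstract addresses.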
Causal variance decompositions for a given disease-specific quality indicator can be used to quantify differences in performance between hospitals or health care providers. While variance decompositions can demonstrate variation in quality of care, causal mediation analysis can be used to study care pathways leading to the differences in performance between the institutions. This raises the question of whether the two approaches can be combined to decompose between-hospital variation in an outcome-type indicator into the variation mediated through a given process (indirect effect) and the remaining variation due to all other pathways (direct effect). For this purpose, we derive a causal mediation analysis decomposition of between-hospital variance, discuss its interpretation, and propose an estimation approach based on generalized linear mixed models for the outcome and the mediator. We study the performance of the estimators in a simulation study and demonstrate their use on administrative data on kidney cancer care in Ontario.
Sensitivity indices for models whose inputs are not independent are estimated by local polynomial techniques. Two original estimators based on local polynomial smoothers are proposed. Both have good theoretical properties, which are exhibited and illustrated through analytical examples. They are used to carry out a sensitivity analysis on a real case: a kinetic model with correlated parameters.
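As a rough illustration of the idea, a first-order sensitivity index Var(E[Y | X_i]) / Var(Y) can be estimated by smoothing Y against X_i with a local linear (degree-1 local polynomial) fit and taking the variance of the fitted values. The sketch below is a generic plug-in estimator under that assumption. It is not one of the two estimators proposed in the abstract, and the Gaussian kernel, the bandwidth `h` and the toy model are all hypothetical choices.

```python
import math
import random

def local_linear_fit(xs, ys, x0, h):
    """Local linear estimate of E[Y | X = x0] using a Gaussian kernel of bandwidth h."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / h) ** 2)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    # Intercept of the kernel-weighted least-squares line, i.e. the fit at x0.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)

def variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def first_order_index(xs, ys, h):
    """Plug-in estimate of Var(E[Y | X]) / Var(Y)."""
    fitted = [local_linear_fit(xs, ys, x0, h) for x0 in xs]
    return variance(fitted) / variance(ys)

# Hypothetical toy model with correlated inputs: x2 depends on x1.
random.seed(0)
n = 200
x1 = [random.random() for _ in range(n)]
x2 = [0.5 * a + 0.5 * random.random() for a in x1]
y = [a + b + random.gauss(0.0, 0.05) for a, b in zip(x1, x2)]

print(first_order_index(x1, y, 0.1))
print(first_order_index(x2, y, 0.1))
```

Because `x1` and `x2` are correlated here, each index also picks up part of the other input's contribution, so the two estimates may sum to more than one.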
Causal effect estimation from observational data is an important but challenging problem, and it is even more difficult when the data contain unobserved variables. The challenges lie in (1) whether the causal effect can be estimated from observational data (identifiability); (2) the accuracy of the estimation (unbiasedness); and (3) a fast data-driven algorithm for the estimation (efficiency). Each of these problems is challenging on its own. Few data-driven methods for causal effect estimation exist so far, and they solve one or two of the above problems, but not all three. In this paper, we present an algorithm that is fast, is unbiased, and is able to confirm whether a causal effect is identifiable under a very practical and commonly seen problem setting. To achieve high efficiency, we approach the causal effect estimation problem as a local search for the minimal adjustment variable sets in data. We show that identifiability and unbiased estimation can both be resolved using data in our problem setting, and we develop theorems to support the local search for adjustment variable sets that achieve unbiased causal effect estimation. We use a frequent pattern mining strategy to further speed up the search process. Experiments performed on an extensive collection of synthetic and real-world datasets demonstrate that the proposed algorithm outperforms the state-of-the-art causal effect estimation methods in both accuracy and time efficiency.
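To make the role of an adjustment variable set concrete, the sketch below estimates an average effect by stratifying on a given set and averaging the stratum-specific differences. It assumes discrete adjustment variables and that every stratum contains both treated and control units. It illustrates only the final adjustment step, not the local search or the pattern-mining procedure of the paper, and the column names and toy data are hypothetical.

```python
from collections import defaultdict

def adjusted_ate(rows, treatment, outcome, adjustment_set):
    """Stratified (adjustment-formula) estimate of the average treatment effect.

    rows: list of dicts; the treatment column must hold 0 or 1.
    adjustment_set: list of column names to stratify on (may be empty).
    """
    strata = defaultdict(lambda: {0: [], 1: []})
    for row in rows:
        key = tuple(row[v] for v in adjustment_set)
        strata[key][row[treatment]].append(row[outcome])
    total = len(rows)
    ate = 0.0
    for groups in strata.values():
        n_stratum = len(groups[0]) + len(groups[1])
        effect = (sum(groups[1]) / len(groups[1])
                  - sum(groups[0]) / len(groups[0]))
        ate += effect * n_stratum / total  # weight by the stratum's share of the data
    return ate

# Hypothetical toy data generated as Y = T + 2 * Z, with Z raising the chance of treatment.
rows = (
    [{"Z": 0, "T": 1, "Y": 1.0}] * 1 + [{"Z": 0, "T": 0, "Y": 0.0}] * 3
    + [{"Z": 1, "T": 1, "Y": 3.0}] * 3 + [{"Z": 1, "T": 0, "Y": 2.0}] * 1
)

print(adjusted_ate(rows, "T", "Y", []))     # no adjustment: confounded by Z
print(adjusted_ate(rows, "T", "Y", ["Z"]))  # adjusting for Z
```

An empty adjustment set reduces the function to a plain difference in means, which makes the effect of adding or removing a variable from the set easy to compare.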
In causal mediation studies that decompose an average treatment effect into a natural indirect effect (NIE) and a natural direct effect (NDE), examples of post-treatment confounding are abundant. Past research has generally considered it infeasible to adjust for a post-treatment confounder of the mediator-outcome relationship because the information is incomplete: the confounder is observed under the actual treatment condition but missing under the counterfactual treatment condition. This study proposes a new sensitivity analysis strategy for handling post-treatment confounding and incorporates it into weighting-based causal mediation analysis without making extra identification assumptions. Under the sequential ignorability of the treatment assignment and of the mediator, we obtain the conditional distribution of the post-treatment confounder under the counterfactual treatment as a function of not just pretreatment covariates but also its counterpart under the actual treatment. The sensitivity analysis then generates bounds for the NIE and for the NDE over a plausible range of the conditional correlation between the post-treatment confounder under the actual condition and that under the counterfactual condition. Implemented through either imputation or integration, the strategy is suitable for binary as well as continuous measures of post-treatment confounders. Simulation results demonstrate major strengths and potential limitations of this new solution. A re-analysis of the National Evaluation of Welfare-to-Work Strategies (NEWWS) Riverside data reveals that the initial analytic results are sensitive to omitted post-treatment confounding.
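For reference, the decomposition mentioned at the start of this abstract is commonly written in potential-outcome notation as follows. The notation is an assumption made here, not taken from the abstract: Y(t, m) is the outcome under treatment t and mediator value m, and M(t) is the mediator under treatment t. The abstract does not say which of the two possible reference conditions the study uses, so this is one common convention only.

```latex
\begin{align}
\mathrm{ATE} &= E[Y(1, M(1))] - E[Y(0, M(0))] \\
\mathrm{NDE} &= E[Y(1, M(0))] - E[Y(0, M(0))] \\
\mathrm{NIE} &= E[Y(1, M(1))] - E[Y(1, M(0))] \\
\mathrm{ATE} &= \mathrm{NDE} + \mathrm{NIE}
\end{align}
```

The cross-world term E[Y(1, M(0))] is the source of the difficulty described above: a post-treatment confounder is observed under the actual treatment only, while this term also depends on its value under the counterfactual treatment.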