When estimating the treatment effect in an observational study, we use a semiparametric locally efficient dimension reduction approach to assess both the treatment assignment mechanism and the average responses in both treated and nontreated groups. We then integrate all results through imputation, inverse probability weighting and doubly robust augmentation estimators. Doubly robust estimators are locally efficient while imputation estimators are super-efficient when the response models are correct. To take advantage of both procedures, we introduce a shrinkage estimator to automatically combine the two, which retains the double robustness property while improving on the variance when the response model is correct. We demonstrate the performance of these estimators through simulated experiments and a real dataset concerning the effect of maternal smoking on baby birth weight. Key words and phrases: Average Treatment Effect, Doubly Robust Estimator, Efficiency, Inverse Probability Weighting, Shrinkage Estimator.
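The three estimator families named in the abstract above can be illustrated with a minimal sketch. This is not the paper's dimension reduction procedure or its shrinkage estimator: the nuisance quantities (propensity score `e`, outcome means `m1` and `m0`) are assumed to be supplied as arrays, the function names are hypothetical, and the simulated data use textbook forms of the estimators with a true average treatment effect of 2.

```python
import numpy as np

def imputation_ate(m1, m0):
    # Outcome-regression (imputation) estimator: average the fitted response difference.
    return float(np.mean(m1 - m0))

def ipw_ate(y, t, e):
    # Inverse probability weighting with propensity scores e = P(T = 1 | X).
    return float(np.mean(t * y / e - (1 - t) * y / (1 - e)))

def aipw_ate(y, t, e, m1, m0):
    # Doubly robust (augmented IPW): outcome model plus a weighted residual correction.
    treated = m1 + t * (y - m1) / e
    control = m0 + (1 - t) * (y - m0) / (1 - e)
    return float(np.mean(treated - control))

# Simulated observational data (assumption: one confounder x, true ATE = 2).
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
e_true = 1.0 / (1.0 + np.exp(-0.5 * x))
t = rng.binomial(1, e_true)
y = 1.0 + 2.0 * t + x + rng.normal(size=n)
m1_true = 3.0 + x  # E[Y | T = 1, X]
m0_true = 1.0 + x  # E[Y | T = 0, X]

ate_imp = imputation_ate(m1_true, m0_true)
ate_ipw = ipw_ate(y, t, e_true)
ate_aipw = aipw_ate(y, t, e_true, m1_true, m0_true)

# Double robustness: one nuisance model may be wrong as long as the other is right.
ate_wrong_outcome = aipw_ate(y, t, e_true, np.zeros(n), np.zeros(n))
ate_wrong_propensity = aipw_ate(y, t, np.full(n, 0.5), m1_true, m0_true)
```

In this simulation the augmented estimator stays close to 2 when either the outcome model or the propensity score is deliberately misspecified, which is the double robustness property the abstract refers to.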
Based on the theory of reproducing kernel Hilbert spaces (RKHS) and semiparametric methods, we propose a new approach to nonlinear dimension reduction. The method extends the semiparametric method to a more general setting in which both the parameters of interest and the nuisance parameters are infinite dimensional. By casting the nonlinear dimension reduction problem in a generalized semiparametric framework, we calculate the orthogonal complement of the generalized nuisance tangent space to derive the estimating equation. Solving the estimating equation using RKHS theory and regularization, we obtain estimates of the dimension reduction directions of the sufficient dimension reduction (SDR) subspace and establish the asymptotic properties of the estimator. Furthermore, the proposed method does not rely on the linearity condition or the constant variance condition. Simulation and real data studies are conducted to demonstrate the finite sample performance of our method in comparison with several existing methods.
SDRcausal is a package that implements sufficient dimension reduction methods for causal inference as proposed in Ghosh, Ma, and de Luna (2021). The package implements (augmented) inverse probability weighting and outcome regression (imputation) estimators of an average treatment effect (ATE) parameter. Nuisance models, both the treatment assignment probability given the covariates (propensity score) and the outcome regression models, are fitted using semiparametric locally efficient dimension reduction estimators, thereby allowing for large sets of confounding covariates. Techniques including linear extrapolation, numerical differentiation, and truncation are used to obtain a practicable implementation of the methods. Finding a suitable dimension reduction map (central mean subspace) requires solving an optimization problem, and the user can choose among several optimization algorithms. The package also provides estimators of the asymptotic variances of the implemented causal effect estimators. Plotting options are provided. The core of the methods is implemented in C, and parallelization is supported. R, which is free and user-friendly, serves as the interface. The package can be downloaded from the GitHub repository: https://github.com/stat4reg.
Missing data and confounding are two problems researchers face in observational studies for comparative effectiveness. Williamson et al. (2012) recently proposed a unified approach to handle both issues concurrently using a multiply-robust (MR) methodology under the assumption that confounders are missing at random. Their approach considers a union of models in which any submodel has a parametric component while the remaining models are unrestricted. We show that while their estimating function is MR in theory, multiply robust inference is complicated by the fact that parametric models for different components of the union model are not variation independent, and therefore the MR property is unlikely to hold in practice. To address this, we propose an alternative transparent parametrization of the likelihood function, which makes explicit the model dependencies between the various nuisance functions needed to evaluate the MR efficient score. The proposed method is genuinely doubly-robust (DR) in that it is consistent and asymptotically normal if one of two sets of modeling assumptions holds. We evaluate the performance and doubly robust property of the DR method via a simulation study.
Causal effect estimation from observational data is an important but challenging problem. Causal effect estimation with unobserved variables in data is even more difficult. The challenges lie in (1) whether the causal effect can be estimated from observational data (identifiability); (2) the accuracy of estimation (unbiasedness); and (3) a fast data-driven algorithm for the estimation (efficiency). Each of these problems is challenging on its own. Few data-driven methods for causal effect estimation exist so far, and they solve one or two of the above problems, but not all. In this paper, we present an algorithm that is fast, unbiased, and able to confirm whether a causal effect is identifiable under a very practical and commonly seen problem setting. To achieve high efficiency, we approach the causal effect estimation problem as a local search for the minimal adjustment variable sets in data. We show that identifiability and unbiased estimation can both be resolved using data in our problem setting, and we develop theorems to support the local search for adjustment variable sets to achieve unbiased causal effect estimation. We use a frequent pattern mining strategy to further speed up the search process. Experiments performed on an extensive collection of synthetic and real-world datasets demonstrate that the proposed algorithm outperforms the state-of-the-art causal effect estimation methods in both accuracy and time-efficiency.
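To make the role of an adjustment variable set concrete, the following minimal sketch estimates a causal effect by stratifying on a single discrete adjustment variable. It does not implement the local search or the frequent pattern mining described above: the adjustment variable `z` is assumed to be already known, the function name `adjusted_effect` is hypothetical, and the data are simulated with a true effect of 1.5.

```python
import numpy as np

def adjusted_effect(y, t, z):
    # Adjustment formula for a discrete adjustment variable z:
    # weight each within-stratum mean difference by the stratum's share of the sample.
    effect = 0.0
    for value in np.unique(z):
        stratum = z == value
        mean_treated = y[stratum & (t == 1)].mean()
        mean_control = y[stratum & (t == 0)].mean()
        effect += stratum.mean() * (mean_treated - mean_control)
    return float(effect)

# Simulated data (assumption: z confounds treatment t and outcome y; true effect = 1.5).
rng = np.random.default_rng(1)
n = 50000
z = rng.binomial(1, 0.5, size=n)
t = rng.binomial(1, np.where(z == 1, 0.8, 0.2))
y = 1.5 * t + 2.0 * z + rng.normal(size=n)

naive = float(y[t == 1].mean() - y[t == 0].mean())  # ignores z, so it is confounded
adjusted = adjusted_effect(y, t, z)                 # conditions on z
```

The unadjusted difference in means absorbs the effect of `z` and overstates the treatment effect, while the stratified estimate recovers a value near 1.5.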
Standard Mendelian randomization analysis can produce biased results if the genetic variant defining the instrumental variable (IV) is confounded and/or has a horizontal pleiotropic effect on the outcome of interest not mediated by the treatment. We provide novel identification conditions for the causal effect of a treatment in the presence of unmeasured confounding, by leveraging an invalid IV for which both the IV independence and exclusion restriction assumptions may be violated. The proposed Mendelian Randomization Mixed-Scale Treatment Effect Robust Identification (MR MiSTERI) approach relies on three assumptions: (i) the treatment effect does not vary with the invalid IV on the additive scale; (ii) the selection bias due to confounding does not vary with the invalid IV on the odds ratio scale; and (iii) the residual variance for the outcome is heteroscedastic and thus varies with the invalid IV. We formally establish that their conjunction can identify a causal effect even with an invalid IV subject to pleiotropy. MiSTERI is shown to be particularly advantageous in the presence of pervasive heterogeneity of pleiotropic effects on the additive scale, a setting in which two recently proposed robust estimation methods, MR GxE and MR GENIUS, can be severely biased. In order to incorporate multiple, possibly correlated and weak IVs, a common challenge in MR studies, we develop a MAny Weak Invalid Instruments (MR MaWII MiSTERI) approach for strengthened identification and improved accuracy. MaWII MiSTERI is shown to be robust to horizontal pleiotropy, violation of the IV independence assumption, and weak IV bias. Both simulation studies and real data analysis results demonstrate the robustness of the proposed MR MiSTERI methods.