
Causal Inference with Invalid Instruments: Post-selection Problems and A Solution Using Searching and Sampling

Author: Zijian Guo
Publication date: 2021
Research field: Mathematical statistics
Paper language: English





Instrumental variable methods are among the most commonly used causal inference approaches to account for unmeasured confounders in observational studies. The presence of invalid instruments is a major concern in practical applications, and inference for the causal effect with possibly invalid instruments is a fast-growing area of research. Existing inference methods rely on correctly separating valid and invalid instruments in a data-dependent way. In this paper, we illustrate the post-selection problems of these existing methods. We construct uniformly valid confidence intervals for the causal effect that are robust to mistakes in separating valid and invalid instruments. Our proposal is to search for values of the causal effect such that a sufficient number of candidate instruments can be taken as valid. We further devise a novel sampling method, which, together with searching, leads to a more precise confidence interval. Our proposed searching and sampling confidence intervals are shown to be uniformly valid under the finite-sample majority and plurality rules. We compare our proposed methods with existing inference methods in a large set of simulation studies and apply them to study the effect of the triglyceride level on the glucose level in a mouse data set.
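
The searching idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes reduced-form estimates `Gamma_hat` (instrument-outcome) and `gamma_hat` (instrument-treatment) with independent standard errors, treats every candidate instrument as relevant, uses a Bonferroni-style critical value, and applies a simple majority count. The function name and all arguments are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def searching_interval(Gamma_hat, gamma_hat, se_Gamma, se_gamma, beta_grid, alpha=0.05):
    """Return (lo, hi) over grid values of beta for which a majority of
    instruments look valid, or None if no grid value qualifies.

    Assumption: an instrument j is treated as valid at beta when
    |Gamma_hat[j] - beta * gamma_hat[j]| is within a critical multiple of its
    standard error, with the two reduced-form estimates taken as independent.
    """
    Gamma_hat = np.asarray(Gamma_hat, dtype=float)
    gamma_hat = np.asarray(gamma_hat, dtype=float)
    se_Gamma = np.asarray(se_Gamma, dtype=float)
    se_gamma = np.asarray(se_gamma, dtype=float)

    p = len(Gamma_hat)
    # Bonferroni-style critical value across the p candidate instruments.
    crit = NormalDist().inv_cdf(1 - alpha / (2 * p))

    accepted = []
    for beta in beta_grid:
        resid = np.abs(Gamma_hat - beta * gamma_hat)
        se = np.sqrt(se_Gamma**2 + beta**2 * se_gamma**2)
        n_valid = int(np.sum(resid <= crit * se))
        if n_valid > p / 2:  # majority rule
            accepted.append(float(beta))

    if not accepted:
        return None
    return min(accepted), max(accepted)
```

With three instruments agreeing on an effect near 0.5 and two that violate the exclusion restriction, the returned interval should cover 0.5 without relying on a prior selection step. The sampling refinement described in the abstract is not sketched here.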




Read also

Instrumental variable methods provide a powerful approach to estimating causal effects in the presence of unobserved confounding. But a key challenge when applying them is the reliance on untestable exclusion assumptions that rule out any relationship between the instrument variable and the response that is not mediated by the treatment. In this paper, we show how to perform consistent IV estimation despite violations of the exclusion assumption. In particular, we show that when one has multiple candidate instruments, only a majority of these candidates (or, more generally, the modal candidate-response relationship) needs to be valid to estimate the causal effect. Our approach uses an estimate of the modal prediction from an ensemble of instrumental variable estimators. The technique is simple to apply and is black-box in the sense that it may be used with any instrumental variable estimator as long as the treatment effect is identified for each valid instrument independently. As such, it is compatible with recent machine-learning-based estimators that allow for the estimation of conditional average treatment effects (CATE) on complex, high-dimensional data. Experimentally, we achieve accurate estimates of conditional average treatment effects using an ensemble of deep network-based estimators, including on a challenging simulated Mendelian Randomization problem.
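
As a rough illustration of taking the mode over an ensemble of per-instrument estimates, the sketch below locates the peak of a Gaussian kernel density over scalar estimates. It is an assumption-laden toy, not the authors' implementation: the function name, the fixed `bandwidth`, and the grid search are all hypothetical choices, and the paper applies the idea to predictions from arbitrary IV estimators rather than to a handful of scalars.

```python
import numpy as np


def modal_iv_estimate(estimates, bandwidth, grid_size=1001):
    """Approximate the mode of per-instrument effect estimates.

    Evaluates an unnormalized Gaussian kernel density on an even grid
    spanning the estimates and returns the grid point with highest density.
    """
    est = np.asarray(estimates, dtype=float)
    grid = np.linspace(est.min(), est.max(), grid_size)
    # Pairwise scaled distances between grid points and estimates.
    diffs = (grid[:, None] - est[None, :]) / bandwidth
    density = np.exp(-0.5 * diffs**2).sum(axis=1)
    return float(grid[np.argmax(density)])
```

If four candidate instruments give estimates clustered near 1.0 and two invalid ones give 3.0 and -2.0, the modal estimate stays near 1.0, whereas the mean would be pulled away by the outliers.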
Mendelian randomization (MR) has become a popular approach to study causal effects by using genetic variants as instrumental variables. We propose a new MR method, GENIUS-MAWII, which simultaneously addresses the two salient phenomena that adversely affect MR analyses: many weak instruments and widespread horizontal pleiotropy. Similar to MR GENIUS (Tchetgen Tchetgen et al., 2019), we achieve identification of the treatment effect by leveraging heteroscedasticity of the exposure. We then derive the class of influence functions of the treatment effect, from which we construct a continuous updating estimator and establish its consistency and asymptotic normality under a many weak invalid instruments asymptotic regime by developing novel semiparametric theory. We also provide a measure of weak identification and a graphical diagnostic tool. We demonstrate in simulations that GENIUS-MAWII has clear advantages in the presence of directional or correlated horizontal pleiotropy compared to other methods. We apply our method to study the effect of body mass index on systolic blood pressure using UK Biobank.
Yifan Cui, Hongming Pu, Xu Shi (2020)
Skepticism about the assumption of no unmeasured confounding, also known as exchangeability, is often warranted in making causal inferences from observational data, because exchangeability hinges on an investigator's ability to accurately measure covariates that capture all potential sources of confounding. In practice, the most one can hope for is that covariate measurements are at best proxies of the true underlying confounding mechanism operating in a given observational study. In this paper, we consider the framework of proximal causal inference introduced by Tchetgen Tchetgen et al. (2020), which, while explicitly acknowledging covariate measurements as imperfect proxies of confounding mechanisms, offers an opportunity to learn about causal effects in settings where exchangeability on the basis of measured covariates fails. We make a number of contributions to proximal inference, including (i) an alternative set of conditions for nonparametric proximal identification of the average treatment effect; (ii) general semiparametric theory for proximal estimation of the average treatment effect, including efficiency bounds for key semiparametric models of interest; (iii) a characterization of proximal doubly robust and locally efficient estimators of the average treatment effect. Moreover, we provide analogous identification and efficiency results for the average treatment effect on the treated. Our approach is illustrated via simulation studies and a data application evaluating the effectiveness of right heart catheterization for critically ill patients in the intensive care unit.
Kangjie Zhou, Jinzhu Jia (2021)
Propensity score methods have been shown to be powerful in obtaining efficient estimators of the average treatment effect (ATE) from observational data, especially in the presence of confounding factors. When estimating, deciding which types of covariates need to be included in the propensity score function is important, since incorporating unnecessary covariates may amplify both the bias and the variance of ATE estimators. In this paper, we show that including additional instrumental variables that satisfy the exclusion restriction for the outcome harms statistical efficiency. We also prove that controlling for covariates that appear as outcome predictors, i.e. that predict the outcomes and are irrelevant to the exposures, can help reduce the asymptotic variance of ATE estimation. We also note that efficiently estimating the ATE by non-parametric or semi-parametric methods requires the estimated propensity score function, as described in Hirano et al. (2003). Such an estimation procedure usually requires many regularity conditions; Rothe (2016) also illustrated this point and proposed a known propensity score (KPS) estimator that requires mild regularity conditions and is still fully efficient. In addition, we introduce a linearly modified (LM) estimator that is nearly efficient in most general settings and does not require estimating the propensity score function, and is hence convenient to calculate. The construction of this estimator borrows ideas from the interaction estimator of Lin (2013), in which regression adjustment with interaction terms is applied to data arising from a completely randomized experiment. As its name suggests, the LM estimator can be viewed as a linear modification of the IPW estimator using known propensity scores. We also investigate its statistical properties both analytically and numerically.
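
For context on the baseline mentioned in this abstract, here is a minimal sketch of the standard inverse probability weighting (IPW) estimator of the ATE with known propensity scores. It is the textbook Horvitz-Thompson form, not the LM or KPS estimator from the paper; the function and argument names are hypothetical.

```python
import numpy as np


def ipw_ate(y, t, e):
    """IPW estimate of the ATE with known propensity scores.

    y: outcomes, t: binary treatment indicators (0 or 1),
    e: known propensity scores P(T = 1 | X), strictly between 0 and 1.
    """
    y, t, e = (np.asarray(a, dtype=float) for a in (y, t, e))
    # Treated units are weighted by 1/e, control units by 1/(1 - e).
    return float(np.mean(t * y / e - (1 - t) * y / (1 - e)))
```

For example, with outcomes `[3, 5, 1, 1]`, treatments `[1, 1, 0, 0]`, and a constant propensity score of 0.5, the estimate is the difference in group means, 3.0.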
We propose novel estimators for categorical and continuous treatments by using an optimal covariate balancing strategy for inverse probability weighting. The resulting estimators are shown to be consistent and asymptotically normal for causal contrasts of interest, either when the model explaining treatment assignment is correctly specified, or when the correct set of bases for the outcome models has been chosen and the assignment model is sufficiently rich. For the categorical treatment case, we show that the estimator attains the semiparametric efficiency bound when all models are correctly specified. For the continuous case, the causal parameter of interest is a function of the treatment dose. The latter is not parametrized, and the proposed estimators are shown to have bias and variance of the classical nonparametric rate. Asymptotic results are complemented with simulations illustrating the finite-sample properties. Our analysis of a data set suggests a nonlinear effect of BMI on the decline in self-reported health.