Instrumental variables (IV) are a useful tool for estimating causal effects in the presence of unmeasured confounding. IV methods are well developed for uncensored outcomes, particularly for linear structural equation models, where simple two-stage estimation schemes are available. The extension of these methods to survival settings is challenging, partly because of the nonlinearity of the popular survival regression models and partly because of the complications associated with right censoring or other survival features. We develop a simple causal hazard ratio estimator in a proportional hazards model with right censored data. The method exploits a special characterization of IV which enables the use of an intuitive inverse weighting scheme that is generally applicable to more complex survival settings with left truncation, competing risks, or recurrent events. We rigorously establish the asymptotic properties of the estimators, and provide plug-in variance estimators. The proposed method can be implemented in standard software, and is evaluated through extensive simulation studies. We apply the proposed IV method to a data set from the Prostate, Lung, Colorectal and Ovarian cancer screening trial to delineate the causal effect of flexible sigmoidoscopy screening on colorectal cancer survival, which may be confounded by informative noncompliance with the assigned screening regimen.
Instrumental variables are widely used to deal with unmeasured confounding in observational studies and imperfect randomized controlled trials. In these studies, researchers often target the so-called local average treatment effect, as it is identifiable under mild conditions. In this paper, we consider estimation of the local average treatment effect under the binary instrumental variable model. We discuss the challenges for causal estimation with a binary outcome, and show that, surprisingly, it can be more difficult than the case with a continuous outcome. We propose novel modeling and estimating procedures that improve upon existing proposals in terms of model congeniality, interpretability, robustness or efficiency. Our approach is illustrated via simulation studies and a real data analysis.
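For orientation, the local average treatment effect with a binary instrument, treatment and outcome can be illustrated by the classical Wald ratio. The sketch below is not the procedure proposed in the abstract: the data-generating process, the function names `simulate` and `wald_late`, and all numerical values are hypothetical, and it assumes no defiers and a randomized instrument.

```python
import numpy as np


def simulate(n, seed=0):
    """Hypothetical data: binary instrument z, treatment d, outcome y."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)  # randomized binary instrument
    # Latent compliance type: 0 = never-taker, 1 = complier, 2 = always-taker
    kind = rng.choice(3, size=n, p=[0.3, 0.5, 0.2])
    d = np.where(kind == 1, z, (kind == 2).astype(int))
    # Outcome probability depends on the latent type (unmeasured confounding)
    # and, for compliers only, on the treatment received.
    p = np.full(n, 0.4)                 # untreated compliers
    p[(kind == 1) & (d == 1)] = 0.7     # treated compliers: true LATE = 0.3
    p[kind == 0] = 0.2                  # never-takers
    p[kind == 2] = 0.8                  # always-takers
    y = rng.binomial(1, p)
    return z, d, y


def wald_late(z, d, y):
    """Wald ratio: effect of z on y divided by effect of z on d."""
    z = np.asarray(z, dtype=bool)
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)
    itt_y = y[z].mean() - y[~z].mean()
    itt_d = d[z].mean() - d[~z].mean()
    return itt_y / itt_d


z, d, y = simulate(200_000)
naive = y[d == 1].mean() - y[d == 0].mean()  # confounded comparison
print("naive difference:", naive)
print("Wald estimate of LATE:", wald_late(z, d, y))
```

In this hypothetical setup the naive treated-versus-untreated contrast mixes compliance types, whereas the Wald ratio targets the complier effect. With a binary outcome the ratio is not guaranteed to stay within $[-1, 1]$ in finite samples, which is one reason the binary case needs more careful modeling.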
Instrumental variable methods can identify causal effects even when the treatment and outcome are confounded. We study the problem of imperfect measurements of the binary instrumental variable, treatment or outcome. We first consider non-differential measurement errors, that is, the mis-measured variable does not depend on other variables given its true value. We show that the measurement error of the instrumental variable does not bias the estimate, the measurement error of the treatment biases the estimate away from zero, and the measurement error of the outcome biases the estimate toward zero. Moreover, we derive sharp bounds on the causal effects without additional assumptions. These bounds are informative because they exclude zero. We then consider differential measurement errors, and focus on sensitivity analyses in those settings.
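The direction of each bias under non-differential error can be traced with a short calculation. This is a sketch under assumptions not stated in the abstract: the estimate is taken to be the Wald ratio $\tau$, starred variables denote mismeasured versions, and $SN$ and $SP$ denote the sensitivity and specificity of a measurement, with $SN + SP > 1$.

$$
\tau = \frac{E(Y \mid Z=1) - E(Y \mid Z=0)}{E(D \mid Z=1) - E(D \mid Z=0)}
$$

If $Z^*$ is independent of $(D, Y)$ given $Z$, then for $W \in \{D, Y\}$

$$
E(W \mid Z^*=1) - E(W \mid Z^*=0) = \{P(Z=1 \mid Z^*=1) - P(Z=1 \mid Z^*=0)\}\,\{E(W \mid Z=1) - E(W \mid Z=0)\},
$$

so the same factor multiplies the numerator and the denominator and cancels whenever it is nonzero, leaving $\tau$ unchanged. If instead $D^*$ is independent of $Z$ given $D$, or $Y^*$ is independent of $Z$ given $Y$, then

$$
\begin{aligned}
E(D^* \mid Z=z) &= (1 - SP_D) + (SN_D + SP_D - 1)\,E(D \mid Z=z)
 &&\Longrightarrow\quad \tau_{D^*} = \frac{\tau}{SN_D + SP_D - 1}, \\
E(Y^* \mid Z=z) &= (1 - SP_Y) + (SN_Y + SP_Y - 1)\,E(Y \mid Z=z)
 &&\Longrightarrow\quad \tau_{Y^*} = (SN_Y + SP_Y - 1)\,\tau .
\end{aligned}
$$

Because $0 < SN + SP - 1 \le 1$, mismeasuring the treatment shrinks the denominator and pushes the ratio away from zero, while mismeasuring the outcome shrinks the numerator and pulls the ratio toward zero.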
Missing data occur frequently in empirical studies in health and social sciences, often compromising our ability to make accurate inferences. An outcome is said to be missing not at random (MNAR) if, conditional on the observed variables, the missing data mechanism still depends on the unobserved outcome. In such settings, identification is generally not possible without imposing additional assumptions. Identification is sometimes possible, however, if an instrumental variable (IV) is observed for all subjects which satisfies the exclusion restriction that the IV affects the missingness process without directly influencing the outcome. In this paper, we provide necessary and sufficient conditions for nonparametric identification of the full data distribution under MNAR with the aid of an IV. In addition, we give sufficient identification conditions that are more straightforward to verify in practice. For inference, we focus on estimation of a population outcome mean, for which we develop a suite of semiparametric estimators that extend methods previously developed for data missing at random. Specifically, we propose inverse probability weighted estimation, outcome regression-based estimation and doubly robust estimation of the mean of an outcome subject to MNAR. For illustration, the methods are used to account for selection bias induced by HIV testing refusal in the evaluation of HIV seroprevalence in Mochudi, Botswana, using interviewer characteristics such as gender, age and years of experience as IVs.
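The inverse probability weighting idea can be illustrated with a minimal sketch in which the response probability is treated as known. This is a strong simplifying assumption: in practice that probability has to be identified and estimated with the help of the IV, a step this sketch omits. The data-generating process, the coefficients, and the names `simulate_mnar`, `response_prob` and `ipw_mean` are all hypothetical.

```python
import numpy as np


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def response_prob(y, z):
    """Hypothetical response probability; treated as known in this sketch."""
    return expit(1.0 - 0.5 * y + 0.5 * z)


def simulate_mnar(n, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n)        # instrument: shifts response only
    y = rng.normal(1.0, 1.0, size=n)      # outcome, generated independently of z
    r = rng.random(n) < response_prob(y, z)  # response depends on y itself: MNAR
    y_obs = np.where(r, y, np.nan)        # outcome is missing when r is False
    return z, r, y_obs


def ipw_mean(r, y_obs, z):
    """Weight each observed outcome by the inverse of its response probability."""
    y_filled = np.where(r, y_obs, 0.0)    # placeholder value; gets zero weight
    pi = response_prob(y_filled, z)
    return np.mean(r * y_filled / pi)


z, r, y_obs = simulate_mnar(200_000)
print("complete-case mean:", np.nanmean(y_obs))
print("IPW mean:", ipw_mean(r, y_obs, z))
```

Because the response probability depends on the outcome, the weights can only be evaluated for respondents, which is all the estimator requires. The complete-case mean is biased in this setup since subjects with larger outcomes respond less often.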
This paper considers the instrumental variable quantile regression model (Chernozhukov and Hansen, 2005, 2013) with a binary endogenous treatment. It offers two identification results when the treatment status is not directly observed. The first result is that, remarkably, the reduced-form quantile regression of the outcome variable on the instrumental variable provides a lower bound on the structural quantile treatment effect under the stochastic monotonicity condition (Small and Tan, 2007; DiNardo and Lee, 2011). This result is relevant not only when the treatment variable is subject to misclassification, but also when no measurement of the treatment variable is available. The second result is for the structural quantile function when the treatment status is measured with error; I obtain the sharp identified set by deriving moment conditions under widely used assumptions on the measurement error. Furthermore, I propose an inference method in the presence of other covariates.
Fan, Gijbels and King [Ann. Statist. 25 (1997) 1661--1690] considered the estimation of the risk function $\psi(x)$ in the proportional hazards model. Their proposed estimator is based on integrating the estimated derivative function obtained through a local version of the partial likelihood. They proved the large sample properties of the derivative function, but the large sample properties of the estimator for the risk function itself were not established. In this paper, we consider direct estimation of the relative risk function $\psi(x_2)-\psi(x_1)$ for any location normalization point $x_1$. The main novelty in our approach is that we select observations in shrinking neighborhoods of both $x_1$ and $x_2$ when constructing a local version of the partial likelihood, whereas Fan, Gijbels and King [Ann. Statist. 25 (1997) 1661--1690] only concentrated on a single neighborhood, resulting in the cancellation of the risk function in the local likelihood function. The asymptotic properties of our estimator are rigorously established and the variance of the estimator is easily estimated. The idea behind our approach is extended to estimate the differences between groups. A simulation study is carried out.