We derive new estimators of an optimal joint testing and treatment regime under the no direct effect (NDE) assumption that a given laboratory, diagnostic, or screening test has no effect on a patient's clinical outcomes except through the effect of the test results on the choice of treatment. We model the optimal joint strategy using an optimal regime structural nested mean model (opt-SNMM). The proposed estimators are more efficient than previous estimators of the parameters of an opt-SNMM because they efficiently leverage the "no direct effect (NDE) of testing" assumption. Our methods will be of importance to decision scientists who either perform cost-benefit analyses or are tasked with estimating the "value of information" supplied by an expensive diagnostic test (such as an MRI to screen for lung cancer).
There is a fast-growing literature on estimating optimal treatment regimes based on randomized trials or observational studies under a key identifying condition of no unmeasured confounding. Because confounding by unmeasured factors cannot generally be ruled out with certainty in observational studies or randomized trials subject to noncompliance, we propose a general instrumental variable approach to learning optimal treatment regimes under endogeneity. Specifically, we establish identification of both the value function $E[Y_{\mathcal{D}(L)}]$ for a given regime $\mathcal{D}$ and the optimal regimes $\text{argmax}_{\mathcal{D}} E[Y_{\mathcal{D}(L)}]$ with the aid of a binary instrumental variable, when no unmeasured confounding fails to hold. We also construct novel multiply robust classification-based estimators. Furthermore, we propose to identify and estimate optimal treatment regimes among those who would comply with the assigned treatment under a standard monotonicity assumption. In this latter case, we establish the somewhat surprising result that complier optimal regimes can be consistently estimated without directly collecting compliance information and therefore without the complier average treatment effect itself being identified. Our approach is illustrated via extensive simulation studies and a data application on the effect of child rearing on labor participation.
In many observational studies in social science and medical applications, subjects or individuals are connected, and one unit's treatment and attributes may affect another unit's treatment and outcome, violating the stable unit treatment value assumption (SUTVA) and resulting in interference. To enable feasible inference, many previous works assume the "exchangeability" of interfering units, under which the effect of interference is captured by the number or ratio of treated neighbors. However, in many applications with distinctive units, interference is heterogeneous. In this paper, we focus on the partial interference setting, and restrict units to be exchangeable conditional on observable characteristics. Under this framework, we propose generalized augmented inverse propensity weighted (AIPW) estimators for general causal estimands that include direct treatment effects and spillover effects. We show that they are consistent, asymptotically normal, semiparametric efficient, and robust to heterogeneous interference as well as model misspecification. We also apply our method to the Add Health dataset and find that smoking behavior exhibits interference on academic outcomes.
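For readers unfamiliar with AIPW, the following is a minimal sketch of the classical AIPW estimator of an average treatment effect for a binary treatment without interference. It is not the generalized estimator proposed in the abstract above; it only illustrates the augmentation idea that the generalized version builds on. The function name `aipw_ate` and its inputs (fitted propensity scores and fitted outcome predictions under treatment and control) are assumptions made for illustration.

```python
def aipw_ate(y, a, propensity, mu1, mu0):
    """Classical AIPW estimate of the average treatment effect.

    Assumes no interference between units (unlike the setting in the abstract).
    y          : observed outcomes
    a          : binary treatment indicators (0 or 1)
    propensity : fitted P(A = 1 | X) for each unit, strictly between 0 and 1
    mu1, mu0   : fitted E[Y | A = 1, X] and E[Y | A = 0, X] for each unit
    """
    n = len(y)
    total = 0.0
    for yi, ai, ei, m1, m0 in zip(y, a, propensity, mu1, mu0):
        # Outcome-model contrast plus inverse-propensity-weighted residuals.
        total += (m1 - m0) + ai * (yi - m1) / ei - (1 - ai) * (yi - m0) / (1.0 - ei)
    return total / n


# Toy data (hypothetical): four units, treatment randomized with probability 0.5.
y = [3.0, 1.0, 4.0, 2.0]
a = [1, 0, 1, 0]
e = [0.5, 0.5, 0.5, 0.5]

# Outcome model that fits the observed outcomes exactly.
est_good_outcome_model = aipw_ate(y, a, e, [3.0, 3.0, 4.0, 4.0], [1.0, 1.0, 2.0, 2.0])

# Deliberately useless outcome model (all zeros) with the correct propensity:
# the estimator then reduces to plain inverse propensity weighting.
est_bad_outcome_model = aipw_ate(y, a, e, [0.0] * 4, [0.0] * 4)
```

On this toy data the two calls return the same value, which is a small illustration of the double robustness property: the estimate remains sensible if either the propensity model or the outcome model is correct.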
Causal inference of treatment effects is a challenging undertaking in and of itself; inference for sequential treatments leads to even more hurdles. In precision medicine, one additional ambitious goal may be to draw inferences about the effects of dynamic treatment regimes (DTRs) and to identify optimal DTRs. Conventional methods for inference about DTRs involve powerful semi-parametric estimators. However, these rely on strong assumptions. Dynamic Marginal Structural Models (MSMs) are one semi-parametric approach used to draw inferences about optimal DTRs in a family of regimes. To achieve this, investigators are forced to model the expected outcome under adherence to a DTR in the family; relatively straightforward models may lead to bias in the optimum. One way to obviate this difficulty is to perform a grid search for the optimal DTR. Unfortunately, this approach becomes prohibitive as the complexity of the regimes considered increases. In recently developed Bayesian methods for dynamic MSMs, computational challenges may be compounded by the fact that a posterior mean must be calculated at each grid point. We propose a way to alleviate modelling difficulties for DTRs by using Gaussian process optimization. More precisely, we show how to pair this optimization approach with robust estimators for the causal effect of adherence to a DTR to identify optimal DTRs. We examine how to find the optimum in complex, multi-modal settings which are not generally addressed in the DTR literature. We further evaluate the sensitivity of the approach to a variety of modeling assumptions in the Gaussian process.
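The grid search mentioned in the abstract above can be sketched for a deliberately simplified case: a single decision point and a one-parameter family of regimes of the form "treat if x > theta", with the value of each regime estimated by normalized inverse probability weighting over the units whose observed treatment agrees with the regime. This is an illustrative assumption, not the dynamic MSM or Gaussian process approach of the abstract; the names `ipw_value` and `grid_search` and the toy data are hypothetical.

```python
def ipw_value(theta, x, a, y, propensity):
    """Normalized IPW estimate of the mean outcome under the regime 'treat if x > theta'."""
    num = 0.0
    den = 0.0
    for xi, ai, yi, ei in zip(x, a, y, propensity):
        d = 1 if xi > theta else 0           # treatment the regime would assign
        if ai == d:                          # unit's observed treatment adheres to the regime
            p_obs = ei if ai == 1 else 1.0 - ei
            num += yi / p_obs
            den += 1.0 / p_obs
    # With no adherent units the regime cannot be evaluated; rank it last.
    return num / den if den > 0 else float("-inf")


def grid_search(thetas, x, a, y, propensity):
    """Evaluate every grid point and return (best_theta, best_value)."""
    best_value, best_theta = max((ipw_value(t, x, a, y, propensity), t) for t in thetas)
    return best_theta, best_value


# Toy data (hypothetical): treatment helps when x > 0.5 and harms otherwise.
x = [0.2, 0.2, 0.4, 0.4, 0.6, 0.6, 0.8, 0.8]
a = [0, 1, 0, 1, 0, 1, 0, 1]
y = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0]
e = [0.5] * 8

best_theta, best_value = grid_search([0.1, 0.3, 0.5, 0.7, 0.9], x, a, y, e)
```

The cost of this loop is one value estimate per grid point, so a regime family with several parameters needs a grid whose size grows exponentially in the number of parameters. That is the computational burden that motivates replacing the exhaustive grid with a sequential optimizer.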
High-precision estimation of nonlinear parameters such as Gini indices, low-income proportions, or other measures of inequality is currently of particular importance. In the present paper, we propose a general class of estimators for such parameters that take into account univariate auxiliary information assumed to be known for every unit in the population. Through a nonparametric model-assisted approach, we construct a unique system of survey weights that can be used to estimate any nonlinear parameter associated with any study variable of the survey, using a plug-in principle. Based on a rigorous functional approach and a linearization principle, the asymptotic variance of the proposed estimators is derived, and variance estimators are shown to be consistent under mild assumptions. The theory is fully detailed for penalized B-spline estimators, together with suggestions for practical implementation and guidelines for choosing the smoothing parameters. The validity of the method is demonstrated on data extracted from the French Labor Force Survey. Point estimates and confidence intervals for the Gini index and the low-income proportion are derived. Theoretical and empirical results highlight the value of a nonparametric approach over a parametric one when estimating nonlinear parameters in the presence of auxiliary information.
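The plug-in principle described above can be illustrated with a minimal sketch: once a set of survey weights is available, a nonlinear parameter such as the Gini index is computed by substituting the weighted empirical distribution into its definition. Here the weights are simply taken as given inputs; the construction of the weights via penalized B-splines is not reproduced. The function name `weighted_gini` is an assumption made for illustration.

```python
def weighted_gini(y, w):
    """Plug-in Gini index from study values y and survey weights w.

    Uses the mean-absolute-difference form:
        G = sum_i sum_j w_i w_j |y_i - y_j| / (2 * (sum_i w_i)^2 * weighted mean of y)
    The double sum is O(n^2), which is fine for a sketch but slow for large samples.
    """
    total_w = sum(w)
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    if mean_y <= 0:
        raise ValueError("the Gini index requires a positive weighted mean")
    abs_diff = sum(
        wi * wj * abs(yi - yj)
        for wi, yi in zip(w, y)
        for wj, yj in zip(w, y)
    )
    return abs_diff / (2.0 * total_w ** 2 * mean_y)


# A unit with weight 3 stands for three population units with the same value,
# so these two calls describe the same population and agree.
g_weighted = weighted_gini([0.0, 1.0], [3.0, 1.0])
g_expanded = weighted_gini([0.0, 0.0, 0.0, 1.0], [1.0, 1.0, 1.0, 1.0])
```

Because the same weights can be reused for any study variable and any parameter, only the final substitution step changes from one inequality measure to another.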
Causal effect estimation from observational data is an important but challenging problem. Causal effect estimation with unobserved variables in data is even more difficult. The challenges lie in (1) whether the causal effect can be estimated from observational data (identifiability); (2) the accuracy of estimation (unbiasedness); and (3) the availability of a fast data-driven algorithm for the estimation (efficiency). Each of these problems is challenging on its own. Few data-driven methods for causal effect estimation exist so far, and they solve one or two of the above problems, but not all. In this paper, we present an algorithm that is fast, unbiased, and able to confirm whether a causal effect is identifiable under a very practical and commonly seen problem setting. To achieve high efficiency, we approach the causal effect estimation problem as a local search for the minimal adjustment variable sets in data. We show that identifiability and unbiased estimation can both be resolved using data in our problem setting, and we develop theorems to support the local search for adjustment variable sets that achieve unbiased causal effect estimation. We use a frequent pattern mining strategy to further speed up the search process. Experiments performed on an extensive collection of synthetic and real-world datasets demonstrate that the proposed algorithm outperforms state-of-the-art causal effect estimation methods in both accuracy and time efficiency.