
Identification of Causal Effects Within Principal Strata Using Auxiliary Variables

Published by Zhichao Jiang
Publication date: 2020
Research field: Mathematical statistics
Language: English





In causal inference, principal stratification is a framework for dealing with a posttreatment intermediate variable between a treatment and an outcome, in which the principal strata are defined by the joint potential values of the intermediate variable. Because the principal strata are not fully observable, the causal effects within them, also known as the principal causal effects, are not identifiable without additional assumptions. Several previous empirical studies leveraged auxiliary variables to improve the inference of principal causal effects. We establish a general theory for identification and estimation of the principal causal effects with auxiliary variables, which provides a solid foundation for statistical inference and more insight for model building in empirical research. In particular, we consider two commonly used strategies for principal stratification problems: principal ignorability and the conditional independence between the auxiliary variable and the outcome given principal strata and covariates. For these two strategies, we give non-parametric and semi-parametric identification results without modeling assumptions on the outcome. When the assumptions for neither strategy are plausible, we propose a large class of flexible parametric and semi-parametric models for identifying principal causal effects. Our theory not only establishes formal identification results for several models that have been used in previous empirical studies but also generalizes them to allow for different types of outcomes and intermediate variables.
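For readers new to the framework, the objects named in the abstract can be written out in commonly used potential-outcome notation. The symbols below (Z for a binary treatment, S for the intermediate variable, Y for the outcome, U for the principal stratum) are assumptions chosen for illustration and may differ from the paper's own notation.

```latex
% Assumed notation: Z in {0,1} is the treatment, S(z) and Y(z) are the
% potential values of the intermediate variable and the outcome under Z = z.
\[
  U = \bigl(S(1),\, S(0)\bigr)
  \qquad \text{(principal stratum: joint potential values of } S\text{)}
\]
\[
  \tau_u = \mathbb{E}\bigl[\, Y(1) - Y(0) \mid U = u \,\bigr]
  \qquad \text{(principal causal effect within stratum } u\text{)}
\]
% Only one of S(1), S(0) is observed for each unit, so U is not fully
% observed and \tau_u is not identified without further assumptions.
```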




Read also

We developed a novel approach to identification and model testing in linear structural equation models (SEMs) based on auxiliary variables (AVs), which generalizes a widely-used family of methods known as instrumental variables. The identification problem is concerned with the conditions under which causal parameters can be uniquely estimated from an observational, non-causal covariance matrix. In this paper, we provide an algorithm for the identification of causal parameters in linear structural models that subsumes previous state-of-the-art methods. In other words, our algorithm identifies strictly more coefficients and models than methods previously known in the literature. Our algorithm builds on a graph-theoretic characterization of conditional independence relations between auxiliary and model variables, which is developed in this paper. Further, we leverage this new characterization for allowing identification when limited experimental data or new substantive knowledge about the domain is available. Lastly, we develop a new procedure for model testing using AVs.
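As background for the abstract above, the classical single-instrument case that auxiliary variables are said to generalize can be stated in one line. This is the textbook instrumental-variable identity, not the algorithm proposed in that paper; the symbols X, Y, Z, β and ε are assumptions introduced here for illustration.

```latex
% Assumed linear model: Y = \beta X + \varepsilon, where X may be correlated
% with \varepsilon, and Z is an instrument for X.
\[
  \operatorname{Cov}(Z, \varepsilon) = 0, \quad
  \operatorname{Cov}(Z, X) \neq 0
  \;\;\Longrightarrow\;\;
  \beta = \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)}
\]
% The coefficient \beta is thus uniquely determined by entries of the
% observed covariance matrix, which is the sense of "identification" here.
```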
Causal inference concerns not only the average effect of the treatment on the outcome but also the underlying mechanism through an intermediate variable of interest. Principal stratification characterizes such a mechanism by targeting subgroup causal effects within principal strata, which are defined by the joint potential values of an intermediate variable. Due to the fundamental problem of causal inference, principal strata are inherently latent, rendering it challenging to identify and estimate subgroup effects within them. A line of research leverages the principal ignorability assumption that the latent principal strata are mean independent of the potential outcomes conditioning on the observed covariates. Under principal ignorability, we derive various nonparametric identification formulas for causal effects within principal strata in observational studies, which motivate estimators relying on the correct specifications of different parts of the observed-data distribution. Appropriately combining these estimators further yields new triply robust estimators for the causal effects within principal strata. These new estimators are consistent if two of the treatment, intermediate variable, and outcome models are correctly specified, and they are locally efficient if all three models are correctly specified. We show that these estimators arise naturally from either the efficient influence functions in the semiparametric theory or the model-assisted estimators in the survey sampling theory. We evaluate different estimators based on their finite-sample performance through simulation, apply them to two observational studies, and implement them in an open-source software package.
155 - Naoki Egami 2018
Although social and biomedical scientists have long been interested in the process through which ideas and behaviors diffuse, the identification of causal diffusion effects, also known as peer and contagion effects, remains challenging. Many scholars consider the commonly used assumption of no omitted confounders to be untenable due to contextual confounding and homophily bias. To address this long-standing problem, we examine the causal identification under a new assumption of structural stationarity, which formalizes the underlying diffusion process with a class of dynamic causal directed acyclic graphs. First, we develop a statistical test that can detect a wide range of biases, including the two types mentioned above. We then propose a difference-in-differences style estimator that can directly correct biases under an additional parametric assumption. Leveraging the proposed methods, we study the spatial diffusion of hate crimes against refugees in Germany. After correcting large upward bias in existing studies, we find hate crimes diffuse only to areas that have a high proportion of school dropouts.
171 - H. Cardot, C. Goga, M.-A Shehzad 2014
In survey sampling, calibration is a very popular tool used to make total estimators consistent with known totals of auxiliary variables and to reduce variance. When the number of auxiliary variables is large, calibration on all the variables may lead to estimators of totals whose mean squared error (MSE) is larger than the MSE of the Horvitz-Thompson estimator even if this simple estimator does not take account of the available auxiliary information. We study in this paper a new technique based on dimension reduction through principal components that can be useful in this large dimension context. Calibration is performed on the first principal components, which can be viewed as the synthetic variables containing the most important part of the variability of the auxiliary variables. When some auxiliary variables play a more important role than the others, the method can be adapted to provide an exact calibration on these important variables. Some asymptotic properties are given in which the number of variables is allowed to tend to infinity with the population size. A data driven selection criterion of the number of principal components ensuring that all the sampling weights remain positive is discussed. The methodology of the paper is illustrated, in a multipurpose context, by an application to the estimation of electricity consumption for each day of a week with the help of 336 auxiliary variables consisting of the past consumption measured every half an hour over the previous week.
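The idea in the abstract above can be illustrated with a minimal sketch, under assumptions that are not taken from the paper: the auxiliary matrix is known for the whole population, principal components are computed from it by SVD, and the calibration uses the linear (chi-square distance) form of the weights. The function names `horvitz_thompson_total` and `pc_calibration_weights` are hypothetical.

```python
import numpy as np

def horvitz_thompson_total(y, pi):
    """Horvitz-Thompson estimator of a total: sum of y_i / pi_i over the sample."""
    y = np.asarray(y, dtype=float)
    pi = np.asarray(pi, dtype=float)
    return float(np.sum(y / pi))

def pc_calibration_weights(X_pop, sample_idx, pi, n_components):
    """Linear calibration weights on the first principal components.

    X_pop        : (N, p) auxiliary variables, assumed known for every unit
    sample_idx   : indices of the n sampled units
    pi           : (n,) inclusion probabilities of the sampled units
    n_components : number of principal components to calibrate on
    """
    X_pop = np.asarray(X_pop, dtype=float)
    sample_idx = np.asarray(sample_idx)
    # Center with population means and take principal directions from the SVD.
    Xc = X_pop - X_pop.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    Z_pop = Xc @ vt[:n_components].T
    # Intercept column (an assumption): weights then also sum to N.
    Z_pop = np.column_stack([np.ones(len(Z_pop)), Z_pop])
    Z_s = Z_pop[sample_idx]
    d = 1.0 / np.asarray(pi, dtype=float)      # design weights
    totals = Z_pop.sum(axis=0)                 # known population totals
    ht_totals = Z_s.T @ d                      # their Horvitz-Thompson estimates
    M = Z_s.T @ (d[:, None] * Z_s)
    lam = np.linalg.solve(M, totals - ht_totals)
    # w_i = d_i * (1 + z_i' lam) reproduces the totals of the retained components.
    return d * (1.0 + Z_s @ lam)
```

With weights `w` from `pc_calibration_weights`, a calibrated total for a survey variable `y` observed on the sample would be `np.sum(w * y)`. Nothing in this sketch forces the weights to stay positive; the abstract mentions a data-driven choice of the number of components for that purpose, which is not reproduced here.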
In this paper, we extend graph-based identification methods by allowing background knowledge in the form of non-zero parameter values. Such information could be obtained, for example, from a previously conducted randomized experiment, from substantive understanding of the domain, or even an identification technique. To incorporate such information systematically, we propose the addition of auxiliary variables to the model, which are constructed so that certain paths will be conveniently cancelled. This cancellation allows the auxiliary variables to help conventional methods of identification (e.g., single-door criterion, instrumental variables, half-trek criterion), as well as model testing (e.g., d-separation, over-identification). Moreover, by iteratively alternating steps of identification and adding auxiliary variables, we can improve the power of existing identification methods via a bootstrapping approach that does not require external knowledge. We operationalize this method for simple instrumental sets (a generalization of instrumental variables) and show that the resulting method is able to identify at least as many models as the most general identification method for linear systems known to date. We further discuss the application of auxiliary variables to the tasks of model testing and z-identification.