
Causal inference methods for combining randomized trials and observational studies: a review

Published by Bénédicte Colnet
Publication date: 2020
Research field: Mathematical statistics
Research language: English





With increasing data availability, causal treatment effects can be evaluated across different datasets, both randomized controlled trials (RCTs) and observational studies. RCTs isolate the effect of the treatment from that of unwanted (confounding) co-occurring effects. But they may struggle with inclusion biases, and thus lack external validity. On the other hand, large observational samples are often more representative of the target population but can conflate confounding effects with the treatment of interest. In this paper, we review the growing literature on methods for causal inference on combined RCTs and observational studies, striving for the best of both worlds. We first discuss identification and estimation methods that improve generalizability of RCTs using the representativeness of observational data. Classical estimators include weighting, difference between conditional outcome models, and doubly robust estimators. We then discuss methods that combine RCTs and observational data to improve (conditional) average treatment effect estimation, handling possible unmeasured confounding in the observational data. We also connect and contrast works developed in both the potential outcomes framework and the structural causal model framework. Finally, we compare the main methods using a simulation study and real world data to analyze the effect of tranexamic acid on the mortality rate in major trauma patients. Code to implement many of the methods is provided.
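As a rough illustration of the weighting idea mentioned in the abstract, the sketch below reweights trial participants so that their covariate distribution matches that of a target (observational) sample, then takes a weighted difference in means. It is a minimal sketch, not the paper's code: it assumes a single discrete covariate, and the function name `reweighted_ate` and the toy data are hypothetical.

```python
import numpy as np


def reweighted_ate(x_trial, a_trial, y_trial, x_target):
    """Weighted difference in means that transports a trial effect to a target sample.

    Assumption (not from the source): one discrete covariate x, so the weight of a
    trial unit is simply P_target(x) / P_trial(x), estimated by sample frequencies.
    """
    x_trial = np.asarray(x_trial)
    a_trial = np.asarray(a_trial)
    y_trial = np.asarray(y_trial, dtype=float)
    x_target = np.asarray(x_target)

    weights = np.zeros(len(x_trial))
    # Covariate levels that never appear in the trial cannot be reweighted
    # (a positivity violation); they are silently ignored in this sketch.
    for level in np.unique(x_trial):
        p_trial = np.mean(x_trial == level)
        p_target = np.mean(x_target == level)
        weights[x_trial == level] = p_target / p_trial

    treated = a_trial == 1
    control = a_trial == 0
    # Normalized (Hajek-style) weighted means in each arm.
    mu1 = np.sum(weights[treated] * y_trial[treated]) / np.sum(weights[treated])
    mu0 = np.sum(weights[control] * y_trial[control]) / np.sum(weights[control])
    return mu1 - mu0


# Toy data, made up for illustration: the effect is 1 when x = 0 and 3 when x = 1.
x_trial = np.array([0] * 8 + [1] * 2)          # x = 1 is rare in the trial
a_trial = np.array([1, 1, 1, 1, 0, 0, 0, 0, 1, 0])
y_trial = a_trial * (1 + 2 * x_trial)
x_target = np.array([0, 1] * 5)                # x = 1 is common in the target sample

naive = y_trial[a_trial == 1].mean() - y_trial[a_trial == 0].mean()
transported = reweighted_ate(x_trial, a_trial, y_trial, x_target)
```

In this toy example the unweighted trial estimate is 1.4, while the reweighted estimate is 2, the average effect in a population where half of the units have x = 1. With continuous or high-dimensional covariates the weights would typically come from a fitted model of trial participation rather than from raw frequencies.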




Read also

The ICH E9 addendum introduces the term intercurrent event to refer to events that happen after randomisation and that can either preclude observation of the outcome of interest or affect its interpretation. It proposes five strategies for handling intercurrent events to form an estimand but does not suggest statistical methods for estimation. In this paper we focus on the hypothetical strategy, where the treatment effect is defined under the hypothetical scenario in which the intercurrent event is prevented. For its estimation, we consider causal inference and missing data methods. We establish that certain causal inference estimators are identical to certain missing data estimators. These links may help those familiar with one set of methods but not the other. Moreover, using potential outcome notation allows us to state more clearly the assumptions on which missing data methods rely to estimate hypothetical estimands. This helps to indicate whether estimating a hypothetical estimand is reasonable, and what data should be used in the analysis. We show that hypothetical estimands can be estimated by exploiting data after intercurrent event occurrence, which is typically not used. We also present Monte Carlo simulations that illustrate the implementation and performance of the methods in different settings.
In a comprehensive cohort study of two competing treatments (say, A and B), clinically eligible individuals are first asked to enroll in a randomized trial and, if they refuse, are then asked to enroll in a parallel observational study in which they can choose treatment according to their own preference. We consider estimation of two estimands: (1) comprehensive cohort causal effect -- the difference in mean potential outcomes had all patients in the comprehensive cohort received treatment A vs. treatment B and (2) randomized trial causal effect -- the difference in mean potential outcomes had all patients enrolled in the randomized trial received treatment A vs. treatment B. For each estimand, we consider inference under various sets of unconfoundedness assumptions and construct semiparametric efficient and robust estimators. These estimators depend on nuisance functions, which we estimate, for illustrative purposes, using generalized additive models. Using the theory of sample splitting, we establish the asymptotic properties of our proposed estimators. We also illustrate our methodology using data from the Bypass Angioplasty Revascularization Investigation (BARI) randomized trial and observational registry to evaluate the effect of percutaneous transluminal coronary balloon angioplasty versus coronary artery bypass grafting on 5-year mortality. To evaluate the finite sample performance of our estimators, we use the BARI dataset as the basis of a realistic simulation study.
Cluster randomized controlled trials (cRCTs) are designed to evaluate interventions delivered to groups of individuals. A practical limitation of such designs is that the number of available clusters may be small, resulting in an increased risk of baseline imbalance under simple randomization. Constrained randomization overcomes this issue by restricting the allocation to a subset of randomization schemes where sufficient overall covariate balance across comparison arms is achieved with respect to a pre-specified balance metric. However, several aspects of constrained randomization for the design and analysis of multi-arm cRCTs have not been fully investigated. Motivated by an ongoing multi-arm cRCT, we provide a comprehensive evaluation of the statistical properties of model-based and randomization-based tests under both simple and constrained randomization designs in multi-arm cRCTs, with varying combinations of design and analysis-based covariate adjustment strategies. In particular, as randomization-based tests have not been extensively studied in multi-arm cRCTs, we additionally develop most-powerful permutation tests under the linear mixed model framework for our comparisons. Our results indicate that under constrained randomization, both model-based and randomization-based analyses could gain power while preserving nominal type I error rate, given proper analysis-based adjustment for the baseline covariates. The choice of balance metrics and candidate set size and their implications on the testing of the pairwise and global hypotheses are also discussed. Finally, we caution against the design and analysis of multi-arm cRCTs with an extremely small number of clusters, due to insufficient degrees of freedom and the tendency to obtain an overly restricted randomization space.
Andrew Ying, Wang Miao, Xu Shi (2021)
A standard assumption for causal inference about the joint effects of time-varying treatment is that one has measured sufficient covariates to ensure that within covariate strata, subjects are exchangeable across observed treatment values, also known as sequential randomization assumption (SRA). SRA is often criticized as it requires one to accurately measure all confounders. Realistically, measured covariates can rarely capture all confounders with certainty. Often covariate measurements are at best proxies of confounders, thus invalidating inferences under SRA. In this paper, we extend the proximal causal inference (PCI) framework of Miao et al. (2018) to the longitudinal setting under a semiparametric marginal structural mean model (MSMM). PCI offers an opportunity to learn about joint causal effects in settings where SRA based on measured time-varying covariates fails, by formally accounting for the covariate measurements as imperfect proxies of underlying confounding mechanisms. We establish nonparametric identification with a pair of time-varying proxies and provide a corresponding characterization of regular and asymptotically linear estimators of the parameter indexing the MSMM, including a rich class of doubly robust estimators, and establish the corresponding semiparametric efficiency bound for the MSMM. Extensive simulation studies and a data application illustrate the finite sample behavior of proposed methods.
Hyunseung Kang, Luke Keele (2018)
Many policy evaluations occur in settings where treatment is randomized at the cluster level, and there is treatment noncompliance within each cluster. For example, villages might be assigned to treatment and control, but residents in each village may choose to comply or not with their assigned treatment status. When noncompliance is present, the instrumental variables framework can be used to identify and estimate causal effects. While a large literature exists on instrumental variables estimation methods, relatively little work has been focused on settings with clustered treatments. Here, we review extant methods for instrumental variable estimation in clustered designs and derive both the finite and asymptotic properties of these estimators. We prove that the properties of current estimators depend on unrealistic assumptions. We then develop a new IV estimation method for cluster randomized trials and study its formal properties. We prove that our IV estimator allows for possible treatment effect heterogeneity that is correlated with cluster size and is robust to low compliance rates within clusters. We evaluate these methods using simulations and apply them to data from a randomized intervention in India.
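To make the village example above concrete, the sketch below computes a generic Wald-type ratio on cluster means: the intention-to-treat effect on the outcome divided by the intention-to-treat effect on treatment uptake. This is a textbook-style illustration under simplifying assumptions, not the new estimator the abstract describes; the function name `cluster_wald_ratio` and the toy data are hypothetical.

```python
import numpy as np


def cluster_wald_ratio(cluster_ids, z, d, y):
    """Ratio of intention-to-treat effects, computed on unweighted cluster means.

    z: cluster-level assignment repeated for each resident (1 = assigned to treatment)
    d: treatment actually taken by each resident
    y: outcome of each resident
    """
    cluster_ids = np.asarray(cluster_ids)
    z = np.asarray(z)
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)

    clusters = np.unique(cluster_ids)
    # Assignment is constant within a cluster, so take the first resident's value.
    z_c = np.array([z[cluster_ids == c][0] for c in clusters])
    d_c = np.array([d[cluster_ids == c].mean() for c in clusters])
    y_c = np.array([y[cluster_ids == c].mean() for c in clusters])

    itt_y = y_c[z_c == 1].mean() - y_c[z_c == 0].mean()
    itt_d = d_c[z_c == 1].mean() - d_c[z_c == 0].mean()
    if itt_d == 0:
        raise ValueError("Assignment does not change uptake; the ratio is undefined.")
    return itt_y / itt_d


# Toy data, made up for illustration: four villages, two assigned to treatment.
# Half of the residents in assigned villages comply, and compliers gain 2 units.
cluster_ids = np.array([1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4])
z = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
d = np.array([1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0])
y = np.array([2, 2, 0, 0, 2, 0, 0, 0, 0, 0, 0])

ratio = cluster_wald_ratio(cluster_ids, z, d, y)
```

Here the outcome difference between assigned and unassigned villages is 1 and the uptake difference is 0.5, so the ratio is 2, the effect among compliers in this toy data. Averaging cluster means with equal weight is only one possible choice; how clusters of different sizes are weighted is exactly the kind of issue the abstract raises.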