We argue that randomized controlled trials (RCTs) are special even among settings where average treatment effects are identified by a nonparametric unconfoundedness assumption. This claim follows from two results of Robins and Ritov (1997): (1) with at least one continuous covariate control, no estimator of the average treatment effect exists that is uniformly consistent without further assumptions, and (2) knowledge of the propensity score yields a consistent estimator and confidence intervals at parametric rates, regardless of how complicated the propensity score function is. We emphasize the latter point, and note that successfully conducted RCTs provide knowledge of the propensity score to the researcher. We discuss modern developments in covariate adjustment for RCTs, noting that statistical models and machine learning methods can be used to improve efficiency while preserving finite sample unbiasedness. We conclude that statistical inference has the potential to be fundamentally more difficult in observational settings than it is in RCTs, even when all confounders are measured.
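As an illustration of point (2), the following is a minimal sketch of a Horvitz-Thompson (inverse-probability-weighted) estimate of the average treatment effect when the propensity score is known by design. It is not taken from the paper: the function names, the simulated data-generating process, and the assignment rule `0.2 + 0.6 * x` are all assumptions chosen for the example.

```python
import random


def horvitz_thompson_ate(outcomes, treatments, propensities):
    """Inverse-probability-weighted ATE estimate using known propensity scores."""
    total = 0.0
    for y, t, e in zip(outcomes, treatments, propensities):
        # Treated units are weighted by 1/e, control units by 1/(1 - e).
        total += t * y / e - (1 - t) * y / (1 - e)
    return total / len(outcomes)


def simulate_trial(n, true_effect=2.0, seed=0):
    """Hypothetical trial with a covariate-dependent but known assignment probability."""
    rng = random.Random(seed)
    outcomes, treatments, propensities = [], [], []
    for _ in range(n):
        x = rng.random()                  # one continuous covariate on [0, 1)
        e = 0.2 + 0.6 * x                 # assumed assignment rule, known to the analyst
        t = 1 if rng.random() < e else 0  # randomized treatment indicator
        y = true_effect * t + x + rng.gauss(0.0, 1.0)
        outcomes.append(y)
        treatments.append(t)
        propensities.append(e)
    return outcomes, treatments, propensities


if __name__ == "__main__":
    y, t, e = simulate_trial(20000)
    print(horvitz_thompson_ate(y, t, e))
```

The estimator uses no outcome model at all; only the known assignment probabilities enter, which is why the complexity of the propensity score function does not matter here.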
Covariate adjustment is an important tool in the analysis of randomized clinical trials and observational studies. It can be used to increase efficiency and thus power, and to reduce possible bias. While most statistical tests in randomized clinical trials are nonparametric in nature, approaches for covariate adjustment typically rely on specific regression models, such as the linear model for a continuous outcome, the logistic regression model for a dichotomous outcome and the Cox model for survival time. Several recent efforts have focused on model-free covariate adjustment. This paper makes use of the empirical likelihood method and proposes a nonparametric approach to covariate adjustment. A major advantage of the new approach is that it automatically utilizes covariate information in an optimal way without fitting nonparametric regression. The usual asymptotic properties, including the Wilks-type result of convergence to a chi-square distribution for the empirical likelihood ratio based test, and asymptotic normality for the corresponding maximum empirical likelihood estimator, are established. It is also shown that the resulting test is asymptotically most powerful and that the estimator for the treatment effect achieves the semiparametric efficiency bound. The new method is applied to the Global Use of Strategies to Open Occluded Coronary Arteries (GUSTO)-I trial. Extensive simulations are conducted, validating the theoretical findings.
Cluster randomized controlled trials (cRCTs) are designed to evaluate interventions delivered to groups of individuals. A practical limitation of such designs is that the number of available clusters may be small, resulting in an increased risk of baseline imbalance under simple randomization. Constrained randomization overcomes this issue by restricting the allocation to a subset of randomization schemes where sufficient overall covariate balance across comparison arms is achieved with respect to a pre-specified balance metric. However, several aspects of constrained randomization for the design and analysis of multi-arm cRCTs have not been fully investigated. Motivated by an ongoing multi-arm cRCT, we provide a comprehensive evaluation of the statistical properties of model-based and randomization-based tests under both simple and constrained randomization designs in multi-arm cRCTs, with varying combinations of design and analysis-based covariate adjustment strategies. In particular, as randomization-based tests have not been extensively studied in multi-arm cRCTs, we additionally develop most-powerful permutation tests under the linear mixed model framework for our comparisons. Our results indicate that under constrained randomization, both model-based and randomization-based analyses could gain power while preserving the nominal type I error rate, given proper analysis-based adjustment for the baseline covariates. The choice of balance metrics and candidate set size and their implications for the testing of the pairwise and global hypotheses are also discussed. Finally, we caution against the design and analysis of multi-arm cRCTs with an extremely small number of clusters, due to insufficient degrees of freedom and the tendency to obtain an overly restricted randomization space.
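The constrained randomization procedure described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: it assumes two arms rather than multiple arms, full enumeration of the allocation schemes, and a balance metric equal to the sum of squared differences in arm means. The names `balance_score`, `constrained_randomization`, and `keep_fraction` are hypothetical.

```python
import itertools
import random


def balance_score(allocation, covariates):
    """Sum over covariates of the squared difference in arm means (assumed metric)."""
    treated = [covariates[i] for i in allocation]
    control = [row for i, row in enumerate(covariates) if i not in allocation]
    score = 0.0
    for k in range(len(covariates[0])):
        mean_t = sum(row[k] for row in treated) / len(treated)
        mean_c = sum(row[k] for row in control) / len(control)
        score += (mean_t - mean_c) ** 2
    return score


def constrained_randomization(covariates, n_treated, keep_fraction, seed=0):
    """Pick one allocation at random from the best-balanced fraction of all schemes."""
    n_clusters = len(covariates)
    # Enumerate every way to assign n_treated clusters to the intervention arm.
    schemes = list(itertools.combinations(range(n_clusters), n_treated))
    # Order schemes from best to worst balance (lower score is better).
    schemes.sort(key=lambda s: balance_score(s, covariates))
    # The candidate set is the best-balanced fraction; always keep at least one.
    n_keep = max(1, int(len(schemes) * keep_fraction))
    candidate_set = schemes[:n_keep]
    chosen = random.Random(seed).choice(candidate_set)
    return chosen, candidate_set


if __name__ == "__main__":
    # Six hypothetical clusters, each with two cluster-level covariates.
    covs = [[1.0, 10.0], [2.0, 12.0], [3.0, 9.0], [4.0, 15.0], [5.0, 11.0], [6.0, 14.0]]
    chosen, candidates = constrained_randomization(covs, n_treated=3, keep_fraction=0.25)
    print(chosen, len(candidates))
```

With few clusters, a small `keep_fraction` leaves very few candidate schemes, which is one way to see the overly restricted randomization space that the abstract cautions against.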
We apply the pigeonhole principle to show that there must exist Boolean functions on 7 inputs with a multiplicative complexity of at least 7, i.e., that cannot be computed with only 6 multiplications in the Galois field with two elements.
Cluster randomized trials (CRTs) are popular in public health and in the social sciences for evaluating a new treatment or policy that is randomly allocated to clusters of units rather than to individual units. CRTs often feature both noncompliance, where individuals within a cluster are not exposed to the intervention, and treatment spillovers, where individuals who comply with the new policy may affect the outcomes of those who do not. Here, we study the identification of causal effects in CRTs when both noncompliance and treatment spillovers are present. We prove that the standard analysis of CRT data with noncompliance using instrumental variables does not identify the usual complier average causal effect when treatment spillovers are present. We extend this result and show that no analysis of CRT data can unbiasedly estimate local network causal effects. Finally, we develop bounds for these causal effects under the assumption that the treatment is not harmful compared to the control. We demonstrate these results with an empirical study of a deworming intervention in Kenya.
Scharfstein et al. (2021) developed a sensitivity analysis model for analyzing randomized trials with repeatedly measured binary outcomes that are subject to nonmonotone missingness. Their approach becomes computationally intractable when the number of repeated measures is large (e.g., greater than 15). In this paper, we repair this problem by introducing an $m$th-order Markovian restriction. We establish identification by representing the model as a directed acyclic graph (DAG). We illustrate our methodology in the context of a randomized trial designed to evaluate a web-delivered psychosocial intervention to reduce substance use, assessed by testing urine samples twice weekly for 12 weeks, among patients entering outpatient addiction treatment. We evaluate the finite sample properties of our method in a realistic simulation study. Our methods have been integrated into the R package slabm.