No Arabic abstract
Response adaptive randomization is appealing in confirmatory adaptive clinical trials from statistical, ethical, and pragmatic perspectives, in the sense that subjects are more likely to be randomized to better performing treatment groups based on accumulating data. The Doubly Adaptive Biased Coin Design (DBCD) is a popular solution due to its asymptotic normal property of final allocations, which further justifies its asymptotic type I error rate control. As an alternative, we propose a Response Adaptive Block Randomization (RABR) design with pre-specified randomization ratios for the control and high-performing groups to robustly achieve desired final sample size per group under different underlying responses, which is usually required in industry-sponsored clinical studies. We show that the usual test statistic has a controlled type I error rate. Our simulations further highlight the advantages of the proposed design over the DBCD in terms of consistently achieving final sample allocations and of power performance. We further apply this design to a Phase III study evaluating the efficacy of two dosing regimens of adjunctive everolimus in treating tuberous sclerosis complex but with no previous dose-finding studies in this indication.
Response-adaptive randomization (RAR) is part of a wider class of data-dependent sampling algorithms, for which clinical trials are used as a motivating application. In that context, patient allocation to treatments is determined by randomization probabilities that are altered based on the accrued response data in order to achieve experimental goals. RAR has received abundant theoretical attention from the biostatistical literature since the 1930s and has been the subject of numerous debates. In the last decade, it has received renewed consideration from the applied and methodological communities, driven by successful practical examples and its widespread use in machine learning. Papers on the subject can give different views on its usefulness, and reconciling these may be difficult. This work aims to address this gap by providing a unified, broad and up-to-date review of methodological and practical issues to consider when debating the use of RAR in clinical trials.
One central goal of design of observational studies is to embed non-experimental data into an approximate randomized controlled trial using statistical matching. Researchers then make the randomization assumption in their downstream, outcome analysis. For matched pair design, the randomization assumption states that the treatment assignment across all matched pairs are independent, and that the probability of the first subject in each pair receiving treatment and the other control is the same as the first receiving control and the other treatment. In this article, we develop a novel framework for testing the randomization assumption based on solving a clustering problem with side-information using modern statistical learning tools. Our testing framework is nonparametric, finite-sample exact, and distinct from previous proposals in that it can be used to test a relaxed version of the randomization assumption called the biased randomization assumption. One important by-product of our testing framework is a quantity called residual sensitivity value (RSV), which quantifies the level of minimal residual confounding due to observed covariates not being well matched. We advocate taking into account RSV in the downstream primary analysis. The proposed methodology is illustrated by re-examining a famous observational study concerning the effect of right heart catheterization (RHC) in the initial care of critically ill patients.
Suppose an online platform wants to compare a treatment and control policy, e.g., two different matching algorithms in a ridesharing system, or two different inventory management algorithms in an online retail site. Standard randomized controlled trials are typically not feasible, since the goal is to estimate policy performance on the entire system. Instead, the typical current practice involves dynamically alternating between the two policies for fixed lengths of time, and comparing the average performance of each over the intervals in which they were run as an estimate of the treatment effect. However, this approach suffers from *temporal interference*: one algorithm alters the state of the system as seen by the second algorithm, biasing estimates of the treatment effect. Further, the simple non-adaptive nature of such designs implies they are not sample efficient. We develop a benchmark theoretical model in which to study optimal experimental design for this setting. We view testing the two policies as the problem of estimating the steady state difference in reward between two unknown Markov chains (i.e., policies). We assume estimation of the steady state reward for each chain proceeds via nonparametric maximum likelihood, and search for consistent (i.e., asymptotically unbiased) experimental designs that are efficient (i.e., asymptotically minimum variance). Characterizing such designs is equivalent to a Markov decision problem with a minimum variance objective; such problems generally do not admit tractable solutions. Remarkably, in our setting, using a novel application of classical martingale analysis of Markov chains via Poissons equation, we characterize efficient designs via a succinct convex optimization problem. We use this characterization to propose a consistent, efficient online experimental design that adaptively samples the two Markov chains.
In this paper, we study the estimation and inference of the quantile treatment effect under covariate-adaptive randomization. We propose two estimation methods: (1) the simple quantile regression and (2) the inverse propensity score weighted quantile regression. For the two estimators, we derive their asymptotic distributions uniformly over a compact set of quantile indexes, and show that, when the treatment assignment rule does not achieve strong balance, the inverse propensity score weighted estimator has a smaller asymptotic variance than the simple quantile regression estimator. For the inference of method (1), we show that the Wald test using a weighted bootstrap standard error under-rejects. But for method (2), its asymptotic size equals the nominal level. We also show that, for both methods, the asymptotic size of the Wald test using a covariate-adaptive bootstrap standard error equals the nominal level. We illustrate the finite sample performance of the new estimation and inference methods using both simulated and real datasets.
Covariate-adaptive randomization schemes such as the minimization and stratified permuted blocks are often applied in clinical trials to balance treatment assignments across prognostic factors. The existing theoretical developments on inference after covariate-adaptive randomization are mostly limited to situations where a correct model between the response and covariates can be specified or the randomization method has well-understood properties. Based on stratification with covariate levels utilized in randomization and a further adjusting for covariates not used in randomization, in this article we propose several estimators for model free inference on average treatment effect defined as the difference between response means under two treatments. We establish asymptotic normality of the proposed estimators under all popular covariate-adaptive randomization schemes including the minimization whose theoretical property is unclear, and we show that the asymptotic distributions are invariant with respect to covariate-adaptive randomization methods. Consistent variance estimators are constructed for asymptotic inference. Asymptotic relative efficiencies and finite sample properties of estimators are also studied. We recommend using one of our proposed estimators for valid and model free inference after covariate-adaptive randomization.