Most clinical trials involve the comparison of a new treatment to a control arm (e.g., the standard of care) and the estimation of a treatment effect. External data, including historical clinical trial data and real-world observational data, are commonly available for the control arm. Borrowing information from external data holds the promise of improving the estimation of relevant parameters and increasing the power of detecting a treatment effect if it exists. In this paper, we propose to use Bayesian additive regression trees (BART) for incorporating external data into the analysis of clinical trials, with a specific goal of estimating the conditional or population average treatment effect. BART naturally adjusts for patient-level covariates and captures potentially heterogeneous treatment effects across different data sources, achieving flexible borrowing. Simulation studies demonstrate that BART compares favorably to a hierarchical linear model and a normal-normal hierarchical model. We illustrate the proposed method with an acupuncture trial.
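For reference, the two estimands named above can be written in standard potential-outcomes notation. The symbols below (Y(1), Y(0), X, tau) are assumptions introduced here for illustration and are not taken from the paper, which may define the estimands relative to a specific target population.

```latex
% Conditional average treatment effect (CATE) at covariate value x
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]

% Population average treatment effect (PATE): the CATE averaged over
% the covariate distribution of the target population
\tau = \mathbb{E}_{X}\left[\, \tau(X) \,\right]
```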
We develop a Bayesian sum-of-trees model where each tree is constrained by a regularization prior to be a weak learner, and fitting and inference are accomplished via an iterative Bayesian backfitting MCMC algorithm that generates samples from a posterior. Effectively, BART is a nonparametric Bayesian regression approach which uses dimensionally adaptive random basis elements. Motivated by ensemble methods in general, and boosting algorithms in particular, BART is defined by a statistical model: a prior and a likelihood. This approach enables full posterior inference, including point and interval estimates of the unknown regression function as well as the marginal effects of potential predictors. By keeping track of predictor inclusion frequencies, BART can also be used for model-free variable selection. BART's many features are illustrated with a bake-off against competing methods on 42 different data sets, with a simulation experiment, and on a drug discovery classification problem.
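The sum-of-trees model described above is commonly written as follows. The notation (m trees, tree structure T_j, leaf parameters M_j) is an assumption introduced here for illustration, since the abstract itself does not fix any symbols.

```latex
% Response as a sum of m regression trees plus Gaussian noise.
% g(x; T_j, M_j) returns the leaf value in M_j that tree T_j assigns to x.
Y = \sum_{j=1}^{m} g(x;\, T_j, M_j) + \varepsilon,
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^{2})
```

Under this reading, the regularization prior keeps each tree's contribution small, so that no single tree dominates the fit.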
In many longitudinal studies, the covariate and response are intermittently observed at irregular, mismatched and subject-specific times. How to handle such data when the covariate and response are observed asynchronously is a frequently raised problem. Bayesian Additive Regression Trees (BART) is a Bayesian nonparametric approach that has been shown to be competitive with the best modern predictive methods, such as random forests and boosted decision trees. The sum-of-trees structure combined with a Bayesian inferential framework provides an accurate and robust statistical method. A BART variant, soft Bayesian Additive Regression Trees (SBART), which is constructed using randomized decision trees, has been developed and shown to have substantial theoretical and practical benefits. In this paper, we propose a weighted SBART model for asynchronous longitudinal data. In comparison to other methods, the proposed method is valid under minimal assumptions on the covariate process. Extensive simulation studies provide numerical support for this solution, and data from an HIV study are used to illustrate our methodology.
Many time-to-event studies are complicated by the presence of competing risks. Such data are often analyzed using Cox models for the cause-specific hazard function or Fine-Gray models for the subdistribution hazard. In practice, regression relationships in competing risks data under either strategy are often complex and may include nonlinear functions of covariates, interactions, high-dimensional parameter spaces and nonproportional cause-specific or subdistribution hazards. Model misspecification can lead to poor predictive performance. To address these issues, we propose a novel approach to flexible prediction modeling of competing risks data using Bayesian Additive Regression Trees (BART). We study the simulation performance in two-sample scenarios as well as a complex regression setting, and benchmark its performance against standard regression techniques as well as random survival forests. We illustrate the use of the proposed method on a recently published study of patients undergoing hematopoietic stem cell transplantation.
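To make the prediction target in competing risks data concrete, the following is a minimal sketch of the standard nonparametric (Aalen-Johansen type) estimate of the cumulative incidence function for one cause. It is not the BART approach proposed in the abstract; the function name, the coding of `causes` (0 for censored, 1, 2, ... for event types) and the toy data are assumptions made purely for illustration.

```python
import numpy as np

def cumulative_incidence(times, causes, cause=1):
    """Nonparametric cumulative incidence estimate for one cause.

    times  : observed follow-up times
    causes : 0 = censored, 1, 2, ... = type of the observed event
             (this coding is an assumption of the sketch)
    Returns the distinct event times and the estimate at each of them.
    """
    times = np.asarray(times, dtype=float)
    causes = np.asarray(causes, dtype=int)
    event_times = np.unique(times[causes != 0])  # sorted distinct event times

    surv = 1.0  # probability of no event of any cause before the current time
    cif = 0.0   # running cumulative incidence for the cause of interest
    estimates = []
    for t in event_times:
        at_risk = np.sum(times >= t)
        d_all = np.sum((times == t) & (causes != 0))    # events of any cause at t
        d_k = np.sum((times == t) & (causes == cause))  # events of this cause at t
        cif += surv * d_k / at_risk       # add mass for the cause of interest
        surv *= 1.0 - d_all / at_risk     # update all-cause event-free survival
        estimates.append(cif)
    return event_times, np.array(estimates)

# Toy data (hypothetical): four subjects, the last one censored.
times = [1, 2, 3, 4]
causes = [1, 2, 1, 0]
grid, cif1 = cumulative_incidence(times, causes, cause=1)
_, cif2 = cumulative_incidence(times, causes, cause=2)
```

A regression approach such as the one proposed would model how this quantity varies with covariates, whereas the sketch estimates it for a single homogeneous sample.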
Incorporating preclinical animal data, which can be regarded as a special kind of historical data, into phase I clinical trials can improve decision making when very little is known about human toxicity. In this paper, we develop a robust hierarchical modelling approach to incorporate animal data into new phase I clinical trials, where we bridge across non-overlapping, potentially heterogeneous patient subgroups. Translation parameters are used to bring both historical and contemporary data onto a common dosing scale. This makes exchangeability assumptions feasible: the parameter vectors that underpin the dose-toxicity relationship in each study are assumed to be drawn from a common distribution. Moreover, human dose-toxicity parameter vectors are assumed to be exchangeable either with the standardised, animal study-specific parameter vectors, or between themselves. The possibility of non-exchangeability is considered for each parameter vector, to avoid inferences for extreme subgroups being overly influenced by the others. We illustrate the proposed approach with several trial data examples, and evaluate the operating characteristics of our model compared with several alternatives in a simulation study. Numerical results show that our approach yields robust inferences in circumstances where data from multiple sources are inconsistent and/or the bridging assumptions are incorrect.
A population-averaged additive subdistribution hazard model is proposed to assess the marginal effects of covariates on the cumulative incidence function when analyzing correlated failure time data subject to competing risks. This approach extends the population-averaged additive hazard model by accommodating potentially dependent censoring due to competing events other than the event of interest. Assuming an independent working correlation structure, an estimating equations approach is considered to estimate the regression coefficients, and a sandwich variance estimator is proposed. The sandwich variance estimator accounts for both the correlations between failure times and those between the censoring times, and is robust to misspecification of the unknown dependency structure within each cluster. We further develop goodness-of-fit tests to assess the adequacy of the additive structure of the subdistribution hazard for each covariate, as well as for the overall model. Simulation studies are carried out to investigate the performance of the proposed methods in finite samples, and we illustrate our methods by analyzing the STrategies to Reduce Injuries and Develop confidence in Elders (STRIDE) study.
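As a sketch of what an additive structure on the subdistribution hazard means, the model can be written in the generic form below. The symbols (F_1 for the cumulative incidence function of the event of interest, Z for covariates, lambda_{10} for an unspecified baseline, beta for regression coefficients) are assumptions introduced here; the paper's exact specification, for example time-varying covariates or cluster indices, may differ.

```latex
% Subdistribution hazard of the event of interest, defined through the
% cumulative incidence function F_1(t | Z)
\lambda_1(t \mid Z) = -\frac{d}{dt} \log\left\{ 1 - F_1(t \mid Z) \right\}

% Additive structure: covariates shift the baseline subdistribution hazard
\lambda_1(t \mid Z) = \lambda_{10}(t) + \beta^{\top} Z
```

In this form each coefficient is read as an absolute difference in the subdistribution hazard per unit change in the corresponding covariate, in contrast to the multiplicative effects of a proportional hazards formulation.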