Network meta-analysis (NMA) allows the combination of direct and indirect evidence from a set of randomized clinical trials. Performing NMA using individual patient data (IPD) is considered the gold standard approach, as it provides several advantages over NMA based on aggregate data. For example, it allows advanced modelling of covariates or covariate-by-treatment interactions. An important issue in IPD NMA is the selection of influential parameters among terms that account for inconsistency, covariates, covariate-by-treatment interactions, or non-proportionality of treatment effects for time-to-event data. This issue has not been studied in depth in the literature, in particular for time-to-event data. A major difficulty is to jointly account for between-trial heterogeneity, which could have a major influence on the selection process. The use of penalized generalized mixed effect models is one solution, but existing implementations have several shortcomings and a substantial computational cost that precludes their use for complex IPD NMA. In this article, we propose a penalized Poisson regression model to perform IPD NMA of time-to-event data. It is based only on fixed effect parameters, which reduces its computational cost relative to the use of random effects. It can be easily implemented using existing penalized regression packages. Computer code is shared for implementation. The methods were applied to simulated data to illustrate the importance of taking between-trial heterogeneity into account during the selection procedure. Finally, the model was applied to an IPD NMA of overall survival with chemotherapy and radiotherapy in nasopharyngeal carcinoma.
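Poisson regression is commonly applied to time-to-event data by splitting each patient's follow-up into intervals and modelling the event indicator in each interval with the log of the time at risk as an offset (the piecewise exponential formulation). The abstract does not state the exact data layout, so the following is only a minimal sketch under that assumption. The function names `split_follow_up` and `poisson_loglik` and the cut points are hypothetical, chosen for illustration.

```python
import math


def split_follow_up(time, event, cuts):
    """Expand one patient's (time, event) record into per-interval rows.

    `cuts` are the right-hand interval boundaries, e.g. [1.0, 3.0, 5.0]
    defines the intervals (0, 1], (1, 3], (3, 5]. Follow-up beyond the last
    cut is truncated, which is an assumption of this sketch.
    """
    rows = []
    start = 0.0
    for k, stop in enumerate(cuts):
        if time <= start:
            break  # the patient left the risk set before this interval
        exposure = min(time, stop) - start  # time at risk in interval k
        event_here = 1 if (event == 1 and time <= stop) else 0
        rows.append({"interval": k, "exposure": exposure, "event": event_here})
        start = stop
    return rows


def poisson_loglik(rows, log_hazards):
    """Poisson log-likelihood (dropping log d!) with mean exposure * hazard.

    It differs from the piecewise exponential log-likelihood only by the
    constant sum of event * log(exposure), so both share the same maximizer.
    """
    total = 0.0
    for row in rows:
        log_mu = math.log(row["exposure"]) + log_hazards[row["interval"]]
        total += row["event"] * log_mu - math.exp(log_mu)
    return total


# A patient who has the event at t = 2.5 contributes two rows:
# 1.0 time units at risk in (0, 1] and 1.5 in (1, 3], with the event in the latter.
example_rows = split_follow_up(2.5, 1, [1.0, 3.0, 5.0])
```

In a full analysis, the linear predictor for each row would also carry trial, treatment, and covariate terms, and the expanded table could be passed to any penalized Poisson regression routine that accepts an offset.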
For survival data with high-dimensional covariates, results generated in the analysis of a single dataset are often unsatisfactory because of the small sample size. Integrative analysis pools raw data from multiple independent studies with comparable designs, effectively increases the sample size, and performs better than meta-analysis and single-dataset analysis. In this study, we conduct integrative analysis of survival data under the accelerated failure time (AFT) model. The sparsity structures of multiple datasets are described using the homogeneity and heterogeneity models. For variable selection under the homogeneity model, we adopt group penalization approaches. For variable selection under the heterogeneity model, we use composite penalization and sparse group penalization approaches. As a major advancement over existing studies, the asymptotic selection and estimation properties are rigorously established. A simulation study is conducted to compare the different penalization methods with each other and against alternatives. We also analyze four lung cancer prognosis datasets with gene expression measurements.
Predicting risks of chronic diseases has become increasingly important in clinical practice. When a prediction model is developed in a given source cohort, there is often great interest in applying the model to other cohorts. However, due to potential discrepancies in baseline disease incidence between cohorts and shifts in patient composition, the risk predicted by the original model often under- or over-estimates the risk in the new cohort. A remedy for such poorly calibrated prediction is needed for proper medical decision-making. In this article, we assume the relative risks of predictors are the same between the two cohorts, and propose a novel weighted estimating equation approach to re-calibrating the projected risk for the target population by updating the baseline risk. The recalibration leverages knowledge of the overall survival probabilities for the disease of interest and competing events, and summary information on risk factors from the target population. The proposed re-calibrated risk estimators gain efficiency if the risk factor distributions are the same for both the source and target cohorts, and are robust with little bias if they differ. We establish the consistency and asymptotic normality of the proposed estimators. Extensive simulation studies demonstrate that the proposed estimators perform very well in terms of robustness and efficiency in finite samples. A real data application to colorectal cancer risk prediction also illustrates that the proposed method can be used in practice for model recalibration.
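The idea of updating only the baseline risk while keeping the relative risks fixed can be illustrated with a deliberately simplified sketch. It is not the weighted estimating equation approach of the article: it ignores competing events, assumes a proportional hazards form S(t | x) = exp(-Λ0 · exp(lp)), and assumes a sample of linear predictors `lp` representative of the target population is available. The function names and the bisection bounds are hypothetical.

```python
import math


def mean_survival(cum_baseline_hazard, linear_predictors):
    """Average predicted survival in the target population at a fixed time."""
    return sum(
        math.exp(-cum_baseline_hazard * math.exp(lp)) for lp in linear_predictors
    ) / len(linear_predictors)


def recalibrate_baseline(target_survival, linear_predictors,
                         lo=0.0, hi=50.0, tol=1e-10):
    """Find the cumulative baseline hazard that reproduces the known overall
    survival probability of the target population.

    mean_survival is strictly decreasing in the baseline hazard, so simple
    bisection on [lo, hi] is enough for this illustration.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean_survival(mid, linear_predictors) > target_survival:
            lo = mid  # predicted survival too high: baseline hazard too small
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical target cohort: three risk profiles, known overall survival 0.90.
lps = [-0.5, 0.0, 0.7]
new_baseline = recalibrate_baseline(0.90, lps)
recalibrated_risks = [1.0 - math.exp(-new_baseline * math.exp(lp)) for lp in lps]
```

After this step the average recalibrated risk matches the external figure by construction, while the ordering of patients implied by the original relative risks is unchanged.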
In a network meta-analysis, some of the collected studies may deviate markedly from the others, for example by having very unusual effect sizes. These deviating studies can be regarded as outlying with respect to the rest of the network and can be influential on the pooled results. Thus, it could be inappropriate to synthesize those studies without further investigation. In this paper, we propose two Bayesian methods to detect outliers in a network meta-analysis via (a) a mean-shifted outlier model and (b) posterior predictive p-values constructed from ad-hoc discrepancy measures. The former method uses Bayes factors to formally test whether each study is an outlier, while the latter provides a score of outlyingness for each study in the network, which allows the uncertainty associated with being an outlier to be quantified numerically. Furthermore, we present a simple method based on informative priors as part of the network meta-analysis model to down-weight the detected outliers. We conduct extensive simulations to evaluate the effectiveness of the proposed methodology while comparing it to some alternative, available outlier diagnostic tools. Two real networks of interventions are then used to demonstrate our methods in practice.
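A posterior predictive p-value for a study can be approximated from posterior draws as the proportion of draws in which a replicated discrepancy is at least as large as the observed one. The sketch below illustrates that computation only. The squared standardized residual used as the discrepancy is a hypothetical choice, not necessarily one of the measures in the paper, and the posterior draws of the study's mean effect are assumed to come from a previously fitted network meta-analysis model.

```python
import random


def posterior_predictive_p(observed, replicated):
    """Share of posterior draws whose replicated discrepancy is >= the observed one."""
    pairs = list(zip(observed, replicated))
    return sum(1 for obs, rep in pairs if rep >= obs) / len(pairs)


def study_outlier_score(y_obs, se, mean_draws, seed=0):
    """Posterior predictive p-value for one study's observed effect y_obs.

    mean_draws: posterior draws of the study's expected effect (assumed given).
    Discrepancy (hypothetical): squared standardized residual ((y - mu) / se)**2.
    A small p-value flags the study as unusual relative to the model.
    """
    rng = random.Random(seed)
    observed, replicated = [], []
    for mu in mean_draws:
        y_rep = rng.gauss(mu, se)  # replicated effect under the model
        observed.append(((y_obs - mu) / se) ** 2)
        replicated.append(((y_rep - mu) / se) ** 2)
    return posterior_predictive_p(observed, replicated)


# An effect ten standard errors from every posterior draw is flagged as extreme.
score = study_outlier_score(y_obs=10.0, se=1.0, mean_draws=[0.0] * 2000)
```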
The penalized Cox proportional hazards model is a popular analytical approach for survival data with a large number of covariates. Such problems are especially challenging when covariates vary over follow-up time (i.e., the covariates are time-dependent). The standard R packages for fully penalized Cox models cannot currently incorporate time-dependent covariates. To address this gap, we implement a variant of the gradient descent algorithm (proximal gradient descent) for fitting penalized Cox models. We apply our implementation to real and simulated data sets.
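Proximal gradient descent alternates a gradient step on the smooth part of the objective with a proximal step for the penalty. The following is a minimal sketch, assuming a lasso penalty, a fixed step size, Breslow-style handling of ties, and data in counting-process form with rows `(start, stop, event, x)`, so that a time-dependent covariate is represented by several rows per subject. None of these choices is confirmed by the abstract, and the function names are hypothetical.

```python
import math


def soft_threshold(z, thresh):
    """Proximal operator of thresh * |z| (the lasso penalty)."""
    if z > thresh:
        return z - thresh
    if z < -thresh:
        return z + thresh
    return 0.0


def cox_gradient(rows, beta):
    """Gradient of the negative log partial likelihood.

    rows: list of (start, stop, event, x). A row is at risk at time t when
    start < t <= stop, which is how time-dependent covariates enter.
    """
    p = len(beta)
    grad = [0.0] * p
    for _, t, event, x_i in rows:
        if not event:
            continue
        denom = 0.0
        weighted_sum = [0.0] * p
        for start, stop, _, x_j in rows:
            if start < t <= stop:  # row j is in the risk set at time t
                w = math.exp(sum(b * v for b, v in zip(beta, x_j)))
                denom += w
                for k in range(p):
                    weighted_sum[k] += w * x_j[k]
        for k in range(p):
            grad[k] -= x_i[k] - weighted_sum[k] / denom
    return grad


def proximal_gradient_cox(rows, lam, step=0.1, n_iter=200):
    """Minimize negative log partial likelihood + lam * ||beta||_1."""
    beta = [0.0] * len(rows[0][3])
    for _ in range(n_iter):
        grad = cox_gradient(rows, beta)
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta


# Toy data: subject A's covariate switches from 0 to 1 at t = 1 (two rows).
toy_rows = [
    (0.0, 1.0, 0, [0.0]),  # subject A before the change
    (1.0, 3.0, 1, [1.0]),  # subject A after the change, event at t = 3
    (0.0, 2.0, 1, [1.0]),  # subject B, event at t = 2
    (0.0, 4.0, 0, [0.0]),  # subject C, censored at t = 4
]
```

A fixed step size is used only for brevity. A practical implementation would typically choose the step by backtracking line search and fit a path of penalty values.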
As an effective nonparametric method, empirical likelihood (EL) is appealing for combining estimating equations flexibly and adaptively to incorporate data information. To select important variables and estimating equations in sparse high-dimensional models, we consider a penalized EL method based on robust estimating functions, applying two penalty functions to regularize the regression parameters and the associated Lagrange multipliers simultaneously. This allows the dimensionalities of both the regression parameters and the estimating equations to grow exponentially with the sample size. In this paper, we give a first examination, from both a theoretical perspective and simulation results, of how the robustness of estimating equations contributes to estimating equation selection and variable selection. The proposed method can improve robustness and effectiveness when the data have underlying outliers or heavy tails in the response variables and/or covariates. The robustness of the estimator is measured via the bounded influence function, and the oracle properties are established under some regularity conditions. Extensive simulation studies and a yeast cell dataset are used to evaluate the performance of the proposed method. The numerical results reveal that the robustness of sparse estimating equation selection fundamentally enhances variable selection accuracy when the data have heavy tails and/or include underlying outliers.