Finding translational biomarkers is central to the future of personalized medicine in healthcare. Identifying robust biomarkers remains challenging: biomarkers that perform well in one scenario often fail in new trials (e.g., a different population or indication). Because clinical trial practice evolves rapidly (e.g., assays, disease definitions), new trials are likely to differ from legacy trials in many respects, and biomarker development should account for this heterogeneity. In response, we recommend building this heterogeneity into biomarker evaluation. In this paper, we present an evaluation strategy that uses leave-one-study-out (LOSO) cross-validation in place of conventional cross-validation (CV) to account for potential heterogeneity across the trials used to build and test a biomarker. To compare K-fold and LOSO CV for estimating biomarker effect sizes, we leveraged data from clinical trials and from simulation studies. In our assessment, LOSO CV provided a more objective estimate of future performance, and this conclusion held across different evaluation metrics and statistical methods.
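As an illustration of the contrast between the two schemes, the following sketch evaluates the same model with K-fold CV and with LOSO CV, where each held-out fold is an entire study. The simulated multi-study data set, the `study_id` grouping, and the logistic-regression biomarker model are illustrative assumptions, not the trials or models used in the paper.

```python
# Minimal sketch contrasting K-fold CV with leave-one-study-out (LOSO) CV.
# All data here are simulated; the model and settings are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n, p, n_studies = 600, 5, 6
X = rng.normal(size=(n, p))                      # candidate biomarkers
study = rng.integers(0, n_studies, size=n)       # which trial each subject came from
# Simulate between-study heterogeneity via study-specific shifts in the outcome
y = (X[:, 0] + 0.5 * study / n_studies + rng.normal(size=n) > 0).astype(int)

model = LogisticRegression(max_iter=1000)

# Conventional K-fold CV: each fold mixes subjects from all studies
kfold_auc = cross_val_score(model, X, y, scoring="roc_auc",
                            cv=KFold(n_splits=5, shuffle=True, random_state=0))

# LOSO CV: each held-out fold is an entire study, mimicking a future trial
loso_auc = cross_val_score(model, X, y, scoring="roc_auc",
                           cv=LeaveOneGroupOut(), groups=study)

print("K-fold AUC:", kfold_auc.mean())
print("LOSO AUC:", loso_auc.mean())
```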
Just-in-time adaptive interventions (JITAIs) are time-varying adaptive interventions that exploit frequent opportunities to adapt the intervention, weekly, daily, or even many times a day. The micro-randomized trial (MRT) has emerged as an experimental design for informing the construction of JITAIs. MRTs can be used to address research questions about whether and under what circumstances JITAI components are effective, with the ultimate objective of developing effective and efficient JITAIs. The purpose of this article is to clarify why, when, and how to use MRTs; to highlight elements that must be considered when designing and implementing an MRT; and to review primary and secondary analysis methods for MRTs. We briefly review key elements of JITAIs and discuss a variety of considerations that go into planning and designing an MRT. We provide a definition of causal excursion effects suitable for use in primary and secondary analyses of MRT data to inform JITAI development. We review the weighted and centered least-squares (WCLS) estimator, which provides consistent estimators of causal excursion effects from MRT data. We describe how the WCLS estimator, along with associated test statistics, can be obtained using standard statistical software such as R (R Core Team, 2019). Throughout, we illustrate the MRT design and analyses using HeartSteps, an MRT for developing a JITAI to increase physical activity among sedentary individuals. We supplement the HeartSteps MRT with two other MRTs, SARA and BariFit, each of which highlights different research questions that can be addressed with an MRT and experimental design considerations that may arise.
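To make the WCLS idea concrete, here is a minimal Python sketch (using statsmodels rather than R) on simulated MRT-like data with a constant excursion effect: the treatment indicator is centered at a fixed reference probability and observations are weighted by the ratio of that reference probability to the actual randomization probability. The simulation, model, and variable names are illustrative assumptions, not the HeartSteps data or the paper's exact estimator.

```python
# Hedged sketch of a weighted and centered least squares (WCLS) style fit
# for a fully marginal causal excursion effect on simulated MRT-like data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_person, n_dec = 40, 50                       # persons x decision points
N = n_person * n_dec
Z = rng.normal(size=N)                         # a time-varying state variable
p_t = 1 / (1 + np.exp(-0.5 * Z))               # randomization prob may depend on state
A = rng.binomial(1, p_t)                       # randomized treatment indicator
Y = 0.3 * A + 0.8 * Z + rng.normal(size=N)     # proximal outcome, true effect 0.3

p_tilde = 0.5                                  # fixed reference probability
# Weights: ratio of the reference to the actual randomization probability
W = np.where(A == 1, p_tilde / p_t, (1 - p_tilde) / (1 - p_t))

# Design: control part (intercept, Z) plus the centered treatment term (A - p_tilde)
Xmat = np.column_stack([np.ones(N), Z, A - p_tilde])
person_id = np.repeat(np.arange(n_person), n_dec)
fit = sm.WLS(Y, Xmat, weights=W).fit(cov_type="cluster",
                                     cov_kwds={"groups": person_id})
print(fit.params[-1])   # estimate of the marginal causal excursion effect (~0.3)
```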
Developing spatio-temporal crime prediction models, and to a lesser extent developing measures of their accuracy and operational efficiency, has been an active area of research for almost two decades. Despite calls for rigorous and independent evaluations of model performance, such studies have been few and far between. In this paper, we argue that studies should focus not on finding the one predictive model or the one measure that is most appropriate at all times, but instead on carefully considering the several factors that affect the choice of model and of measure, so as to find the best measure and the best model for the problem at hand. Because each problem is unique, it is important to develop measures that empower the practitioner to input the choices and preferences most appropriate to that problem. We develop a new measure, the penalized predictive accuracy index (PPAI), which imparts such flexibility. We also propose using an expected utility function to combine multiple measures in a way that is appropriate for a given problem, so that models can be assessed against multiple criteria. We further propose the average logarithmic score (ALS), a measure that is appropriate for many crime models and quantifies accuracy differently than existing measures. These measures can be used alongside existing ones to provide a more comprehensive assessment of the accuracy and potential utility of spatio-temporal crime prediction models.
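For readers unfamiliar with grid-based accuracy measures, the sketch below illustrates the standard predictive accuracy index (PAI) and a simple average logarithmic score on a toy grid. The paper's PPAI penalty is not specified here, so the `penalty` hook shown is a hypothetical placeholder, not the measure defined in the paper.

```python
# Illustrative sketch of grid-based accuracy measures for a crime forecast.
import numpy as np

def pai(crime_counts, flagged):
    """Predictive accuracy index: hit rate divided by the fraction of area flagged."""
    hit_rate = crime_counts[flagged].sum() / crime_counts.sum()
    return hit_rate / flagged.mean()

def penalized_pai(crime_counts, flagged, penalty):
    """Hypothetical penalized PAI: subtract a practitioner-chosen penalty term."""
    return pai(crime_counts, flagged) - penalty(flagged)

def average_log_score(crime_counts, predicted_prob, eps=1e-12):
    """Average log predicted probability over the cells where crimes occurred."""
    p = np.clip(predicted_prob, eps, 1.0)
    return np.sum(crime_counts * np.log(p)) / crime_counts.sum()

# Toy example on a 10x10 grid
rng = np.random.default_rng(2)
pred = rng.dirichlet(np.ones(100))                  # predicted probability per cell
crimes = rng.multinomial(50, pred * 0.7 + 0.003)    # observed crime counts
hotspots = pred > np.quantile(pred, 0.9)            # flag the top 10% of cells

print(pai(crimes, hotspots), average_log_score(crimes, pred))
# Toy penalty (purely illustrative): penalize the fraction of area flagged
print(penalized_pai(crimes, hotspots, penalty=lambda f: 0.1 * f.mean()))
```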
Is it possible for a large sequence of measurements or observations that support a hypothesis to counterintuitively decrease our confidence in it? Can unanimous support be too good to be true? The assumption of independence is often made in good faith; however, consideration is rarely given to whether a systemic failure has occurred. Taking this into account can cause certainty in a hypothesis to decrease as the evidence for it becomes apparently stronger. We perform a probabilistic Bayesian analysis of this effect with examples based on (i) archaeological evidence, (ii) weighing of legal evidence, and (iii) cryptographic primality testing. We find that even with surprisingly low systemic failure rates, high confidence is very difficult to achieve; in particular, certain analyses of cryptographically important numerical tests are highly optimistic, underestimating their false-negative rate by as much as a factor of $2^{80}$.
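The mechanism can be illustrated with a short numerical sketch: once a small probability of systemic failure is allowed, the posterior probability of the hypothesis given unanimous confirmation first rises and then falls back toward the prior. The prior, per-test accuracy, and failure rate below are illustrative assumptions, not the paper's case studies.

```python
# Numerical sketch of the "too good to be true" effect under a possible systemic failure.
import numpy as np

prior = 0.5       # prior probability that the hypothesis H is true
acc = 0.95        # per-test probability of a correct confirmation given H
fp = 0.10         # per-test probability of a (false) confirmation given not-H
q = 0.01          # probability that a systemic failure makes every test confirm

def posterior_given_unanimous(n):
    like_h = (1 - q) * acc**n + q          # P(all n confirm | H)
    like_not = (1 - q) * fp**n + q         # P(all n confirm | not H)
    return prior * like_h / (prior * like_h + (1 - prior) * like_not)

for n in [1, 5, 10, 50, 200]:
    print(n, round(posterior_given_unanimous(n), 4))
# Confidence rises with the first few confirmations, then falls back toward the
# prior as unanimity becomes better explained by a systemic failure than by H.
```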
Optimization algorithms and Monte Carlo sampling algorithms have provided the computational foundations for the rapid growth in applications of statistical machine learning in recent years. There is, however, limited theoretical understanding of the relationships between these two kinds of methodology, and of their relative strengths and weaknesses. Moreover, existing results have been obtained primarily in the setting of convex functions (for optimization) and log-concave functions (for sampling). In this setting, where local properties determine global properties, optimization algorithms are, unsurprisingly, more efficient computationally than sampling algorithms. We instead examine a class of nonconvex objective functions that arise in mixture modeling and multi-stable systems. In this nonconvex setting, we find that the computational complexity of sampling algorithms scales linearly with the model dimension while that of optimization algorithms scales exponentially.
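The multi-stable setting can be illustrated with a toy double-well objective: gradient descent converges to whichever basin it starts in, while Langevin sampling visits both modes. This is only an illustration of the setting; it does not reproduce the paper's complexity analysis, and the potential and step sizes are illustrative choices.

```python
# Gradient descent vs. unadjusted Langevin sampling on a double-well objective.
import numpy as np

def grad_U(x):                      # U(x) = (x^2 - 1)^2, a double-well potential
    return 4 * x * (x**2 - 1)

rng = np.random.default_rng(3)
step = 1e-2

# Gradient descent from a start in the right-hand basin stays there
x = 0.3
for _ in range(5000):
    x -= step * grad_U(x)
print("gradient descent limit:", round(x, 3))        # ~ +1, never reaches -1

# Unadjusted Langevin algorithm explores both wells
x, samples = 0.3, []
for _ in range(200_000):
    x += -step * grad_U(x) + np.sqrt(2 * step) * rng.normal()
    samples.append(x)
samples = np.array(samples)
print("fraction of time near each well:",
      round(np.mean(samples < 0), 2), round(np.mean(samples > 0), 2))
```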
Background: All US states have enacted at least some naloxone access laws (NALs) in an effort to reduce opioid overdose lethality. Previous evaluations found that NALs increased naloxone dispensing but showed mixed results for opioid overdose mortality. One reason for the mixed results could be failure to address violations of the positivity assumption caused by the co-occurrence of NAL enactment with the enactment of related laws, ultimately resulting in bias, increased variance, and low statistical power. Methods: We reformulated the research question to alleviate some challenges related to law co-occurrence. Because NAL enactment was closely correlated with Good Samaritan Law (GSL) enactment, we bundled NAL with GSL and estimated the hypothetical associations of enacting NAL/GSL up to 2 years earlier (an amount supported by the observed data) with naloxone dispensation and opioid overdose mortality. Results: We estimated that such a shift in NAL/GSL duration would have been associated with increased naloxone dispensations (0.28 dispensations/100,000 people (95% CI: 0.18-0.38) in 2013 among early NAL/GSL enactors; 47.58 (95% CI: 28.40-66.76) in 2018 among late enactors). We also estimated that the shift would have been associated with increased opioid overdose mortality (1.88 deaths/100,000 people (95% CI: 1.03-2.72) in 2013; 2.06 (95% CI: 0.92-3.21) in 2018). Conclusions: Consistent with prior research, increased duration of NAL/GSL enactment was associated with increased naloxone dispensing. Contrary to expectation, we did not find a protective association with opioid overdose mortality, though residual bias due to unobserved confounding and interference likely remains.
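As a heavily hedged illustration of the general idea of a "shift" estimand (not the paper's estimator, data, or adjustment set), the sketch below compares predicted outcomes under the observed NAL/GSL duration with predictions had the laws been enacted two years earlier, using a simple outcome regression on simulated state-level data with hypothetical column names.

```python
# Illustrative outcome-regression (parametric g-formula style) sketch of a
# 2-year shift in law duration; all data and variable names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "duration": rng.uniform(0, 6, size=500),       # years since NAL/GSL enactment
    "covariate": rng.normal(size=500),             # a state-level confounder
})
df["naloxone_rate"] = 2.0 * df["duration"] + 1.5 * df["covariate"] + rng.normal(size=500)

fit = smf.ols("naloxone_rate ~ duration + covariate", data=df).fit()

shifted = df.assign(duration=df["duration"] + 2)   # hypothetical 2-year-earlier enactment
effect = (fit.predict(shifted) - fit.predict(df)).mean()
print("estimated association of a 2-year shift:", round(effect, 2))
```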