With the resurgence of tick-borne diseases such as Lyme disease and the emergence of new pathogens such as Powassan virus, understanding what distinguishes vector from non-vector species and predicting undiscovered tick vectors are important steps towards mitigating human disease risk. We apply generalized boosted regression to interrogate over 90 features for over 240 species of Ixodes ticks. Our model predicted vector status with ~97% accuracy and implicated 14 tick species whose intrinsic trait profiles confer high probabilities (~80%) that they are capable of transmitting infections from animal hosts to humans. Distinguishing characteristics of zoonotic tick vectors include several anatomical structures that facilitate efficient host seeking and blood-feeding from a wide variety of host species. Boosted regression analysis produced both actionable predictions to guide ongoing surveillance and testable hypotheses about the biological underpinnings of vectorial capacity across tick species.
Alzheimer's disease (AD) and Parkinson's disease (PD) are the two most common neurodegenerative disorders in humans. Because a significant percentage of patients have clinical and pathological features of both diseases, it has been hypothesized that the patho-cascades of the two diseases overlap. Despite this evidence, the two diseases are rarely studied jointly. In this paper, we utilize clinical, imaging, genetic, and biospecimen features to cluster AD and PD patients into the same feature space. By training a machine learning classifier on the combined feature space, we predict the disease stage of patients two years after their baseline visits. We observed a considerable improvement in the prediction accuracy for Parkinson's dementia patients due to combined training on Alzheimer's and Parkinson's patients, thereby affirming the claim that these two diseases can be jointly studied.
We present the findings of The Alzheimer's Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge, which compared the performance of 92 algorithms from 33 international teams at predicting the future trajectory of 219 individuals at risk of Alzheimer's disease. Challenge participants were required to make a prediction, for each month of a 5-year future time period, of three key outcomes: clinical diagnosis, Alzheimer's Disease Assessment Scale Cognitive Subdomain (ADAS-Cog13), and total volume of the ventricles. No single submission was best at predicting all three outcomes. For clinical diagnosis and ventricle volume prediction, the best algorithms strongly outperform simple baselines in predictive ability. However, for ADAS-Cog13 no single submitted prediction method was significantly better than random guessing. Two ensemble methods, based on taking the mean and the median over all predictions, obtained top scores on almost all tasks. Better-than-average performance at diagnosis prediction was generally associated with the additional inclusion of features from cerebrospinal fluid (CSF) samples and diffusion tensor imaging (DTI). On the other hand, better performance at ventricle volume prediction was associated with inclusion of summary statistics, such as patient-specific biomarker trends. The submission system remains open via the website https://tadpole.grand-challenge.org, while code for submissions is being collated by TADPOLE SHARE: https://tadpole-share.github.io/. Our work suggests that current prediction algorithms are accurate for biomarkers related to clinical diagnosis and ventricle volume, opening up the possibility of cohort refinement in clinical trials for Alzheimer's disease.
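The mean and median ensembles mentioned above can be illustrated with a minimal sketch. It assumes each algorithm supplies one numeric forecast per future month for a given individual; the function name `ensemble_forecasts` and the sample numbers are hypothetical and are not taken from the challenge data or its scoring code.

```python
from statistics import mean, median


def ensemble_forecasts(forecasts):
    """Combine forecasts from several algorithms, month by month.

    `forecasts` is a list with one entry per algorithm; each entry is a
    list of numeric predictions, one per future month (assumed layout).
    Returns (mean_forecast, median_forecast) as two lists.
    """
    if not forecasts:
        raise ValueError("need at least one forecast")
    n_months = len(forecasts[0])
    if any(len(f) != n_months for f in forecasts):
        raise ValueError("all forecasts must cover the same months")
    # zip(*forecasts) regroups the values so each tuple holds every
    # algorithm's prediction for a single month.
    by_month = list(zip(*forecasts))
    mean_forecast = [mean(values) for values in by_month]
    median_forecast = [median(values) for values in by_month]
    return mean_forecast, median_forecast


# Hypothetical forecasts from three algorithms over three months.
forecasts = [
    [10.0, 11.0, 12.0],
    [12.0, 13.0, 20.0],  # one algorithm drifts high in the last month
    [11.0, 12.0, 13.0],
]
mean_fc, median_fc = ensemble_forecasts(forecasts)
print(mean_fc)
print(median_fc)
```

In this toy input the median is less affected by the single outlying prediction in the last month than the mean is, which is one common reason to report both combinations.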
The technique of Formal Concept Analysis is applied to a dataset describing the traits of rodents, with the goal of identifying zoonotic disease carriers, that is, those species carrying infections that can spill over to cause human disease. The concepts identified among these species together provide rules of thumb about the intrinsic biological features of rodents that carry zoonotic diseases, and offer utility for better targeting field surveillance efforts in the search for novel disease carriers in the wild.
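As a rough illustration of what a formal concept is, the sketch below enumerates every concept of a tiny binary species-by-trait table by brute force. A concept pairs a set of species (the extent) with the full set of traits they all share (the intent), such that no further species has all of those traits. The species names, trait names, and helper functions are hypothetical and chosen for illustration only; the brute-force enumeration is exponential in the number of species and is not necessarily the algorithm used in the study.

```python
from itertools import combinations


def common_attributes(objects, context):
    """Traits shared by every object in `objects` (the intent)."""
    shared = set().union(*context.values())  # start from all traits
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)


def objects_having(attrs, context):
    """Objects that possess every trait in `attrs` (the extent)."""
    return frozenset(o for o, traits in context.items() if attrs <= traits)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every object subset."""
    concepts = set()
    objs = list(context)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, context)
            extent = objects_having(intent, context)
            concepts.add((extent, intent))
    return concepts


# Hypothetical toy context: species mapped to the traits they have.
context = {
    "species_a": {"short_lived", "large_litters", "carrier"},
    "species_b": {"short_lived", "carrier"},
    "species_c": {"large_litters"},
}
concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

In this toy table, one concept groups `species_a` and `species_b` under the traits `short_lived` and `carrier`, which is the kind of pattern that could be read as a rule of thumb linking a trait to carrier status.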
Traditionally, expert epidemiologists devise policies for disease control through a mixture of intuition and brute force. Namely, they use their know-how to narrow down the set of logically conceivable policies to a small family described by a few parameters, following which they conduct a grid search to identify the optimal policy within the set. This scheme is not scalable, in the sense that, when used to optimize over policies which depend on many parameters, it will likely fail to output an optimal disease policy in time for its implementation. In this article, we use techniques from convex optimization theory and machine learning to conduct optimizations over disease policies described by hundreds of parameters. In contrast to past approaches for policy optimization based on control theory, our framework can deal with arbitrary uncertainties on the initial conditions and model parameters controlling the spread of the disease. In addition, our methods allow for optimization over weekly-constant policies, specified by either continuous or discrete government measures (e.g., lockdown on/off). We illustrate our approach by minimizing the total time required to eradicate COVID-19 within the Susceptible-Exposed-Infected-Recovered (SEIR) model proposed by Kissler et al. (March, 2020).
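To make the traditional scheme concrete, the sketch below runs a grid search over a two-parameter family of weekly-constant lockdown policies (start week and duration of a single lockdown block) on a basic SEIR model integrated with Euler steps. This illustrates the few-parameter baseline that the abstract contrasts against, not the convex-optimization method of the article. All rates, the lockdown effect, and the cost function (peak infectious fraction plus a per-week lockdown penalty) are assumptions made for illustration; they are not the parameters of Kissler et al. nor the eradication-time objective used in the article.

```python
def simulate_seir(weekly_lockdown, beta=0.5, sigma=0.2, gamma=0.1,
                  reduction=0.7, dt=0.1, e0=1e-4):
    """Euler-integrate a basic SEIR model in population fractions.

    `weekly_lockdown` is a list of booleans, one per week; during a
    lockdown week the transmission rate is scaled by (1 - reduction).
    All default values are illustrative assumptions.
    Returns (peak infectious fraction, final (s, e, i, r) state).
    """
    s, e, i, r = 1.0 - e0, e0, 0.0, 0.0
    peak = 0.0
    steps_per_week = round(7 / dt)
    for locked in weekly_lockdown:
        b = beta * (1.0 - reduction) if locked else beta
        for _ in range(steps_per_week):
            new_exposed = b * s * i * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
            peak = max(peak, i)
    return peak, (s, e, i, r)


def block_policy(start, duration, horizon):
    """Weekly on/off policy with one lockdown block of `duration` weeks."""
    return [start <= week < start + duration for week in range(horizon)]


def grid_search(horizon=52, step=4, cost_per_week=0.002):
    """Exhaustively score every (start, duration) pair on a coarse grid."""
    best = None
    for start in range(0, horizon, step):
        for duration in range(0, horizon - start + 1, step):
            peak, _ = simulate_seir(block_policy(start, duration, horizon))
            cost = peak + cost_per_week * duration  # hypothetical objective
            if best is None or cost < best[0]:
                best = (cost, start, duration)
    return best


best_cost, best_start, best_duration = grid_search()
print(f"start week {best_start}, duration {best_duration} weeks, "
      f"cost {best_cost:.4f}")
```

The grid here has only two parameters, so it contains on the order of a hundred candidate policies. A policy with an independent on/off choice for each of 52 weeks would have 2^52 candidates, which is the scalability problem the abstract describes.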
We study metapopulation models for the spread of epidemics in which different subpopulations (cities) are connected by fluxes of individuals (travelers). This framework allows one to describe the spread of a disease on a large scale, and we focus here on the computation of the arrival time of a disease as a function of the properties of the seed of the epidemic and of the characteristics of the network connecting the various subpopulations. Using analytical and numerical arguments, we introduce an easily computable quantity which approximates this average arrival time. Using the example of a disease spreading on the world-wide airport network, we show that this quantity predicts with good accuracy the order of arrival of the disease in the various subpopulations in each realization of an epidemic scenario, and not only for an average over realizations. Finally, this quantity might be useful in the identification of the dominant paths of the disease spread.