Updating observations of a signal to account for delays in the measurement process is a common problem in signal processing, with prominent examples in a wide range of fields. An important example of this problem is the nowcasting of COVID-19 mortality: given a stream of reported counts of daily deaths, can we correct for the delays in reporting to paint an accurate picture of the present, with uncertainty? Without this correction, raw data will often mislead by suggesting an improving situation. We present a flexible approach using a latent Gaussian process that is capable of describing the changing auto-correlation structure present in the reporting time-delay surface. This approach also yields robust estimates of uncertainty for the nowcasted numbers of deaths. We test assumptions in model specification, such as the choice of kernel or hyperpriors, and evaluate model performance on a challenging real dataset from Brazil. Our experiments show that Gaussian process nowcasting compares favourably with both comparable methods and a small sample of expert human predictions. Our approach has substantial practical utility in disease modelling: by applying it to COVID-19 mortality data from Brazil, where reporting delays are large, we can make informative predictions of important epidemiological quantities such as the current effective reproduction number.
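To illustrate why uncorrected counts can suggest an improving situation, the following is a minimal sketch of a much simpler correction than the latent Gaussian process described above. It assumes a fixed, fully known reporting-delay distribution (the values in `DELAY_PMF` are hypothetical) and a constant true daily death count; the function names are illustrative and are not taken from the paper.

```python
import numpy as np

# Hypothetical delay distribution: share of a day's deaths reported
# 0, 1, 2, 3 and 4 days after they occur (assumed known and constant).
DELAY_PMF = np.array([0.20, 0.30, 0.25, 0.15, 0.10])
CUM_REPORTED = np.cumsum(DELAY_PMF)


def fraction_reported(n_days, cum_frac=CUM_REPORTED):
    """Share of each day's deaths already reported, as seen on the last day."""
    ages = n_days - 1 - np.arange(n_days)  # days elapsed since each day
    return cum_frac[np.minimum(ages, len(cum_frac) - 1)]


def observed_counts(true_counts):
    """Counts visible today when reports arrive with the assumed delays."""
    return true_counts * fraction_reported(len(true_counts))


def naive_nowcast(observed):
    """Scale each observed count up by the share expected to be reported."""
    return observed / fraction_reported(len(observed))


# Toy scenario: the true number of daily deaths is flat.
true_deaths = np.full(10, 100.0)
observed = observed_counts(true_deaths)
nowcast = naive_nowcast(observed)
```

In this toy scenario the most recent entries of `observed` fall away even though `true_deaths` is constant, and dividing by the reported share recovers the flat series. Real delay distributions are unknown and change over time, and this sketch gives no uncertainty estimate, which is the gap the Gaussian process approach is meant to address.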
Model selection is a fundamental part of applied Bayesian statistical methodology. Metrics such as the Akaike Information Criterion are commonly used in practice to select models, but they do not incorporate the uncertainty in a model's parameters and can give misleading choices. One approach that uses the full posterior distribution is to compute the ratio of two models' normalising constants, known as the Bayes factor. In realistic problems, this often involves the integration of analytically intractable, high-dimensional distributions, and therefore requires the use of stochastic methods such as thermodynamic integration (TI). In this paper we apply a variation of the TI method, referred to as referenced TI, which computes a single model's normalising constant efficiently by using a judiciously chosen reference density. The advantages of the approach and theoretical considerations are set out, along with explicit pedagogical 1D and 2D examples. We benchmark the approach against comparable methods and find favourable convergence performance. The approach is shown to be useful in practice when applied to a real problem: performing model selection for a semi-mechanistic hierarchical Bayesian model of COVID-19 transmission in South Korea, which involves the integration of a 200D density.
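The idea of integrating from a reference density with a known normalising constant can be sketched in one dimension. The example below is an illustration under stated assumptions, not the paper's implementation: the target `log_q` and the Gaussian reference `log_q_ref` are hypothetical, the geometric path $q_\lambda = q^{\lambda} q_{\mathrm{ref}}^{1-\lambda}$ is one common choice, and the expectations along the path are computed by grid quadrature rather than by the MCMC sampling a high-dimensional problem would need. It uses the standard TI identity $\log z = \log z_{\mathrm{ref}} + \int_0^1 \mathbb{E}_{q_\lambda}[\log q - \log q_{\mathrm{ref}}]\, d\lambda$.

```python
import numpy as np

# Uniform grid standing in for samples; only sensible in low dimension.
GRID = np.linspace(-10.0, 10.0, 20001)


def log_q(x):
    """Hypothetical unnormalised target: Gaussian with a quartic term."""
    return -0.5 * x**2 - 0.1 * x**4


def log_q_ref(x):
    """Unnormalised Gaussian reference exp(-x^2 / 2)."""
    return -0.5 * x**2


# Known normalising constant of the reference: z_ref = sqrt(2 * pi).
LOG_Z_REF = 0.5 * np.log(2.0 * np.pi)


def path_expectation(lam, x=GRID):
    """E[log q - log q_ref] under q_lam, proportional to q^lam * q_ref^(1 - lam)."""
    log_path = lam * log_q(x) + (1.0 - lam) * log_q_ref(x)
    weights = np.exp(log_path - log_path.max())
    weights /= weights.sum()  # grid spacing cancels on a uniform grid
    return np.sum(weights * (log_q(x) - log_q_ref(x)))


def referenced_ti_log_z(n_lambda=51):
    """log z = log z_ref + integral over lambda of the path expectation."""
    lams = np.linspace(0.0, 1.0, n_lambda)
    values = np.array([path_expectation(lam) for lam in lams])
    integral = np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(lams))
    return LOG_Z_REF + integral


def direct_log_z(x=GRID):
    """Brute-force quadrature of the target, used only as a check."""
    dx = x[1] - x[0]
    return np.log(np.sum(np.exp(log_q(x))) * dx)
```

Calling `referenced_ti_log_z()` and `direct_log_z()` should give closely matching values for this toy target. A Bayes factor would then follow as the difference of two such log normalising constants, one per model.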
The COVID-19 pandemic has caused severe public health consequences in the United States. The United States began a vaccination campaign at the end of 2020, primarily targeting elderly residents before extending access to younger individuals. With both COVID-19 infection fatality ratios and vaccine uptake being heterogeneous across ages, an important consideration is whether the age contribution to deaths shifted over time towards younger age groups. In this study, we use a Bayesian non-parametric spatial approach to estimate the age-specific contribution to COVID-19 attributable deaths over time. The proposed spatial approach is a low-rank Gaussian Process projected by regularised B-splines. Simulation analyses and benchmark results show that the spatial approach performs better than a standard B-splines approach and as well as a standard Gaussian Process, at considerably lower runtimes. We find that COVID-19 has been especially deadly in the United States. The mortality rates among individuals aged 85+ ranged from 1% to 5% across the US states. Since the beginning of the vaccination campaign, the number of weekly deaths fell in every US state, with a faster decrease among individuals aged 75+ than among individuals aged 0-74. Alongside this reduction, the contribution of individuals aged 75+ to deaths decreased, with important disparities in the timing and speed of this decrease across the country.
The COVID-19 pandemic has created an urgent need for robust, scalable monitoring tools that support stratification of high-risk patients. This research aims to develop and validate prediction models, using the UK Biobank, to estimate COVID-19 mortality risk in confirmed cases. From the 11,245 participants testing positive for COVID-19, we develop a data-driven random forest classification model with excellent performance (AUC: 0.91), using baseline characteristics, pre-existing conditions, symptoms, and vital signs, such that the score can dynamically assess mortality risk as the disease deteriorates. We also identify several significant novel predictors of COVID-19 mortality with equivalent or greater predictive value than established high-risk comorbidities, such as detailed anthropometrics and prior acute kidney failure, urinary tract infection, and pneumonias. The model design and feature selection enable utility in outpatient settings. Possible applications include supporting individual-level risk profiling and monitoring disease progression across patients with COVID-19 at scale, especially in hospital-at-home settings.
As the second wave in India subsides, COVID-19 has now infected about 29 million patients countrywide and caused more than 350 thousand deaths. As the infections surged, the strain on the medical infrastructure in the country became apparent. While the country vaccinates its population, opening up the economy may lead to an increase in infection rates. In this scenario, it is essential to use the limited hospital resources effectively through an informed patient triaging system based on clinical parameters. Here, we present two interpretable machine learning models that predict the clinical outcomes of severity and mortality, based on routine non-invasive surveillance of blood parameters on the day of admission from one of the largest cohorts of Indian patients. The patient severity and mortality prediction models achieved 86.3% and 88.06% accuracy, respectively, with an AUC-ROC of 0.91 and 0.92. We have integrated both models in a user-friendly web app calculator, https://triage-COVID-19.herokuapp.com/, to showcase the potential deployment of such efforts at scale.
Timely estimation of the current value of the COVID-19 reproduction factor $R$ has become a key aim of efforts to inform management strategies. $R$ is an important metric used by policy-makers in setting mitigation levels and is also important for accurate modelling of epidemic progression. This brief paper introduces a method for estimating $R$ from biased case testing data. Using testing data, rather than hospitalisation or death data, provides a much earlier metric along the symptomatic progression scale. This can be hugely important when fighting the exponential nature of an epidemic. We develop a practical estimator and apply it to Scottish case testing data to infer a current (20 May 2020) $R$ value of $0.74$ with $95\%$ confidence interval $[0.48, 0.86]$.