Suppose we have a Bayesian model that combines evidence from several different sources. We want to know which model parameters most affect the estimate or decision from the model, or which of the parameter uncertainties drive the decision uncertainty. Furthermore, we want to prioritise which further data should be collected. These questions can be addressed by Value of Information (VoI) analysis, in which we estimate expected reductions in loss from learning specific parameters or collecting data of a given design. We describe the theory and practice of VoI for Bayesian evidence synthesis, using and extending ideas from health economics, computer modelling and Bayesian design. The methods apply to a range of decision problems, including point estimation and choices between discrete actions. We apply them to a model for estimating prevalence of HIV infection, combining indirect information from several surveys, registers and expert beliefs. This analysis shows which parameters contribute most of the uncertainty about each prevalence estimate, and provides the expected improvements in precision from collecting specific amounts of additional data.
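For a point-estimation decision under squared-error loss, the expected value of partial perfect information (EVPPI) for a parameter reduces to the expected reduction in posterior variance, which can be estimated by regressing Monte Carlo output on that parameter (the nonparametric-regression idea of Strong and colleagues). The sketch below illustrates this on an invented two-parameter toy, not the paper's HIV synthesis model:

```python
# A minimal sketch of Monte Carlo EVPPI for a point-estimation decision
# under squared-error loss, where expected loss equals posterior variance.
# The model and its parameters are illustrative toys, not the paper's model.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Toy "evidence synthesis": the quantity of interest y depends on two
# uncertain parameters drawn from their (pretend) posterior.
theta1 = rng.beta(20, 80, n)        # e.g. prevalence in one subgroup
theta2 = rng.beta(5, 95, n)         # e.g. prevalence in another
y = 0.7 * theta1 + 0.3 * theta2     # overall quantity to be estimated

# EVPPI(theta1) = Var(y) - E[Var(y | theta1)] = Var(E[y | theta1]).
# E[y | theta1] is estimated by a polynomial regression of y on theta1,
# which avoids a nested Monte Carlo loop.
coef = np.polyfit(theta1, y, deg=3)
fitted = np.polyval(coef, theta1)   # estimate of E[y | theta1]
evppi_theta1 = np.var(fitted)
print(f"Var(y)        = {np.var(y):.3e}")
print(f"EVPPI(theta1) = {evppi_theta1:.3e}")
```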
In this study, we begin a comprehensive characterisation of temperature extremes in Ireland for the period 1981-2010. We produce return levels of anomalies of daily maximum temperature extremes for an area over Ireland for this 30-year period. We employ extreme value theory (EVT) to model the data using the generalised Pareto distribution (GPD) as part of a three-level Bayesian hierarchical model. We use predictive processes to solve the computationally difficult problem of modelling data over a very dense spatial field. To our knowledge, this is the first study to combine predictive processes and EVT in this manner. The model is fit using Markov chain Monte Carlo (MCMC) algorithms. Posterior parameter estimates and return level surfaces are produced, in addition to specific site analysis at synoptic stations, including Casement Aerodrome and Dublin Airport. Observational data from the period 2011-2018 are included in this site analysis to determine whether there is evidence of a change in the observed extremes. An increase in the frequency of extreme anomalies, but not their severity, is observed for this period. We find that the frequency of observed extreme anomalies from 2011-2018 at the Casement Aerodrome and Phoenix Park synoptic stations exceeds the upper bounds of the credible intervals from the model by 20% and 7%, respectively.
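As a single-site illustration of the GPD step, the sketch below fits threshold exceedances by maximum likelihood (a stand-in for the paper's three-level Bayesian hierarchical fit) and converts the fitted parameters into a return level; the data are simulated, not the Irish records:

```python
# A minimal single-site sketch: fit a GPD to exceedances of daily
# max-temperature anomalies over a high threshold, then compute a
# return level. Simulated data and an ML fit stand in for the paper's
# observations and Bayesian hierarchical model.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(42)
anomalies = rng.normal(0.0, 3.0, size=30 * 365)   # toy daily anomalies, 30 years

u = np.quantile(anomalies, 0.95)                  # threshold at 95th percentile
excess = anomalies[anomalies > u] - u
zeta_u = np.mean(anomalies > u)                   # exceedance rate P(X > u)

# Fit GPD(shape=xi, scale=sigma) to the threshold excesses (location fixed at 0).
xi, _, sigma = genpareto.fit(excess, floc=0.0)

# m-observation return level (level exceeded on average once per m observations):
#   x_m = u + (sigma / xi) * ((m * zeta_u)**xi - 1),  for xi != 0.
m = 50 * 365                                      # ~50-year return period, daily data
x_m = u + sigma / xi * ((m * zeta_u) ** xi - 1.0)
print(f"xi={xi:.3f}, sigma={sigma:.3f}, 50-year return level ~ {x_m:.2f}")
```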
Background: Predicted probabilities from a risk prediction model are inevitably uncertain. This uncertainty has mostly been studied from a statistical perspective. We apply Value of Information methodology to evaluate the decision-theoretic implications of prediction uncertainty. Methods: Adopting a Bayesian perspective, we extend the definition of the Expected Value of Perfect Information (EVPI) from decision analysis to net benefit calculations in risk prediction. EVPI is the expected gain in net benefit by using the correct predictions as opposed to predictions from a proposed model. We suggest bootstrap methods for sampling from the posterior distribution of predictions for EVPI calculation using Monte Carlo simulations. In a case study, we used subsets of data of various sizes from a clinical trial for predicting mortality after myocardial infarction to show how EVPI can be interpreted and how it changes with sample size. Results: With a sample size of 1,000, EVPI was 0 at threshold values larger than 0.6, indicating there is no point in procuring more development data for such thresholds. At thresholds of 0.4-0.6, the proposed model was not net beneficial, but EVPI was positive, indicating that obtaining more development data might be justified. Across all thresholds, the gain in net benefit by using the correct model was 24% higher than the gain by using the proposed model. EVPI declined with larger samples and was generally low with sample sizes of 4,000 or greater. We summarize an algorithm for incorporating EVPI calculations into the commonly used bootstrap method for optimism correction. Conclusion: Value of Information methods can be applied to explore decision-theoretic consequences of uncertainty in risk prediction, and can complement inferential methods when developing or validating risk prediction models.
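The logic described in the abstract can be sketched compactly: a bootstrap refit plays the role of sampling from the posterior of the "correct" predictions, and EVPI is the expected net benefit of deciding with those correct predictions minus the best achievable with the proposed model or the default strategies. The data, model, and threshold below are toys, not the myocardial infarction case study:

```python
# A minimal sketch of bootstrap-based EVPI for net benefit. At threshold pt,
# NB = E[TP]/n - (pt/(1-pt)) * E[FP]/n for the rule "treat if risk >= pt".
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n = 1000
X = rng.normal(size=(n, 2))
true_lp = -2.0 + X @ np.array([1.0, 0.5])
y = rng.binomial(1, 1 / (1 + np.exp(-true_lp)))

pt = 0.2                                   # decision threshold
w = pt / (1 - pt)                          # harm:benefit exchange rate

model = LogisticRegression().fit(X, y)     # the "proposed" model
p_hat = model.predict_proba(X)[:, 1]

def nb(p_true, treat):
    """Net benefit of a treat/no-treat rule when p_true are the correct risks."""
    return np.mean(treat * p_true) - w * np.mean(treat * (1 - p_true))

B = 200
nb_correct, nb_model, nb_all = [], [], []
for _ in range(B):
    idx = rng.integers(0, n, n)                      # bootstrap resample
    mb = LogisticRegression().fit(X[idx], y[idx])
    p_b = mb.predict_proba(X)[:, 1]                  # draw of "correct" risks
    nb_correct.append(nb(p_b, p_b >= pt))            # decide with correct risks
    nb_model.append(nb(p_b, p_hat >= pt))            # decide with proposed model
    nb_all.append(nb(p_b, np.ones(n, dtype=bool)))   # treat everyone

best_current = max(np.mean(nb_model), np.mean(nb_all), 0.0)  # 0 = treat no one
evpi = np.mean(nb_correct) - best_current
print(f"EVPI at threshold {pt}: {evpi:.4f}")
```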
Existing methods to estimate the prevalence of chronic hepatitis C (HCV) in New York City (NYC) are limited in scope and fail to assess hard-to-reach subpopulations at highest risk, such as injecting drug users (IDUs). To address these limitations, we employ a Bayesian multi-parameter evidence synthesis model to systematically combine multiple sources of data, account for bias in certain data sources, and provide unbiased HCV prevalence estimates with associated uncertainty. Our approach improves on previous estimates by explicitly accounting for injecting drug use and including data from high-risk subpopulations such as the incarcerated, and is more inclusive, utilizing ten NYC data sources. In addition, we derive two new equations that allow data on age at first injecting drug use for former and current IDUs to be incorporated into the Bayesian evidence synthesis, a first for this type of model. Our estimated overall HCV prevalence as of 2012 among NYC adults aged 20-59 years is 2.78% (95% CI 2.61-2.94%), which represents between 124,900 and 140,000 chronic HCV cases. These estimates suggest that HCV prevalence in NYC is higher than previously indicated by household surveys (2.2%) and the surveillance system (2.37%), and that HCV transmission is increasing among young injecting adults in NYC. An ancillary benefit of our results is an estimate of the number of current IDUs aged 20-59 in NYC: 0.58%, or 27,600 individuals.
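The core evidence-synthesis idea can be shown in miniature: independent survey data inform subgroup prevalences, which combine into an overall prevalence with full uncertainty propagation. The subgroups, counts, and population weights below are invented; the actual model links ten NYC data sources with bias adjustments and the age-at-first-injection equations:

```python
# A minimal sketch of multi-parameter evidence synthesis with conjugate
# Beta posteriors. All numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
draws = 50_000

# Pretend surveys: (cases, sample size) per subgroup; Beta(1,1) priors give
# conjugate Beta posteriors for each subgroup prevalence.
idu_former  = rng.beta(1 + 120, 1 + 300 - 120, draws)   # former IDUs
idu_current = rng.beta(1 + 200, 1 + 400 - 200, draws)   # current IDUs
non_idu     = rng.beta(1 + 150, 1 + 9000 - 150, draws)  # never-injectors

# Invented population shares of each subgroup among adults aged 20-59.
w = np.array([0.02, 0.006, 0.974])
overall = w[0] * idu_former + w[1] * idu_current + w[2] * non_idu

lo, hi = np.percentile(overall, [2.5, 97.5])
print(f"overall prevalence: {overall.mean():.3%} (95% CrI {lo:.3%}-{hi:.3%})")
```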
Let $X := (X_1, \ldots, X_p)$ be random objects (the inputs), defined on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and valued in some measurable space $E = E_1 \times \ldots \times E_p$. Further, let $Y := f(X_1, \ldots, X_p)$ be the output. Here, $f$ is a measurable function from $E$ to some Hilbert space $\mathbb{H}$ ($\mathbb{H}$ may be of either finite or infinite dimension). In this work, we give a natural generalization of the Sobol indices (classically defined when $Y \in \mathbb{R}$) to outputs valued in $\mathbb{H}$. These indices have very nice properties. First, they are invariant under isometry and scaling. Further, as in dimension $1$, they can easily be estimated using the so-called Pick-Freeze method. We investigate the asymptotic behaviour of this estimation scheme.
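The Pick-Freeze estimator is easy to exhibit for a vector-valued (finite-dimensional Hilbert space) output, where the generalized index replaces scalar (co)variances by traces of covariance matrices: $S_i = \mathrm{Tr}\,\mathrm{Cov}(Y, Y^i) / \mathrm{Tr}\,\mathrm{Var}(Y)$, with $Y^i$ computed after "freezing" $X_i$ and redrawing the other inputs. The test function below is an arbitrary toy:

```python
# A minimal sketch of the Pick-Freeze estimator for a vector-valued output.
import numpy as np

rng = np.random.default_rng(3)
n, p = 100_000, 3

def f(x):
    """Toy map from R^3 to R^2."""
    return np.stack([np.sin(x[:, 0]) + 0.5 * x[:, 1],
                     x[:, 0] * x[:, 2]], axis=1)

X  = rng.normal(size=(n, p))
Xp = rng.normal(size=(n, p))              # independent copy of the inputs

def pick_freeze_index(i):
    # "Pick" coordinate i from X, "freeze" it, redraw the others from Xp.
    X_i = Xp.copy()
    X_i[:, i] = X[:, i]
    Y, Yi = f(X), f(X_i)
    mu = (Y + Yi).mean(axis=0) / 2        # pooled mean estimate
    cov_trace = np.sum((Y - mu) * (Yi - mu)) / n                   # Tr Cov(Y, Y^i)
    var_trace = np.sum((Y - mu) ** 2 + (Yi - mu) ** 2) / (2 * n)   # Tr Var(Y)
    return cov_trace / var_trace

for i in range(p):
    print(f"S_{i + 1} ~ {pick_freeze_index(i):.3f}")
```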
The celebrated Abakaliki smallpox data have appeared numerous times in the epidemic modelling literature, but in almost all cases only a specific subset of the data is considered. There is one previous analysis of the full data set, but this relies on approximation methods to derive a likelihood. The data themselves continue to be of interest due to concerns about the possible re-emergence of smallpox as a bioterrorism weapon. We present the first full Bayesian analysis using data-augmentation Markov chain Monte Carlo methods which avoid the need for likelihood approximations. Results include estimates of basic model parameters as well as reproduction numbers and the likely path of infection. Model assessment is carried out using simulation-based methods.
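A key ingredient of such data-augmentation MCMC is that, once the unobserved infection times are imputed, the rate parameters of a general stochastic epidemic have closed-form Gamma full conditionals. The sketch below shows only those complete-data Gibbs draws, with invented event times; the full algorithm would alternate them with Metropolis-Hastings moves of the infection times:

```python
# A minimal complete-data sketch of the Gibbs steps inside a
# data-augmentation MCMC for a general stochastic (SIR-type) epidemic.
# Event times below are invented, not the Abakaliki data.
import numpy as np

rng = np.random.default_rng(11)

N = 120                                          # closed population size
inf_times = np.array([0.0, 0.8, 1.1, 1.9, 2.4])  # (augmented) infection times
rem_times = np.array([2.0, 2.9, 3.5, 4.0, 4.8])  # observed removal times
n = len(inf_times)

# Infectious pressure (1/N) * int S(t) I(t) dt via the standard double sum,
# with infection time +inf for the N - n never-infected individuals.
all_inf = np.concatenate([inf_times, np.full(N - n, np.inf)])
pressure = sum(
    np.sum(np.minimum(r, all_inf) - np.minimum(i, all_inf))
    for i, r in zip(inf_times, rem_times)
) / N

# Gamma(shape, rate) priors on both rates are conjugate given complete data:
# beta | rest ~ Gamma(a + n - 1, b + pressure),
# gamma | rest ~ Gamma(c + n, d + total infectious time).
a_b, b_b, a_g, b_g = 1.0, 1.0, 1.0, 1.0
draws = 5000
beta  = rng.gamma(a_b + n - 1, 1.0 / (b_b + pressure), draws)
gamma = rng.gamma(a_g + n, 1.0 / (b_g + np.sum(rem_times - inf_times)), draws)

print(f"beta  mean: {beta.mean():.3f}")
print(f"gamma mean: {gamma.mean():.3f}")
print(f"R0 ~ beta/gamma: {np.mean(beta / gamma):.2f}")
```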