No Arabic abstract
This paper presents an approach to estimating the health effects of an environmental hazard. The approach is general in nature, but is applied here to the case of air pollution. It uses a computer model involving ambient pollution and temperature inputs, to simulate the exposures experienced by individuals in an urban area, whilst incorporating the mechanisms that determine exposures. The output from the model comprises a set of daily exposures for a sample of individuals from the population of interest. These daily exposures are approximated by parametric distributions, so that the predictive exposure distribution of a randomly selected individual can be generated. These distributions are then incorporated into a hierarchical Bayesian framework (with inference using Markov Chain Monte Carlo simulation) in order to examine the relationship between short-term changes in exposures and health outcomes, whilst making allowance for long-term trends, seasonality, the effect of potential confounders and the possibility of ecological bias. The paper applies this approach to particulate pollution (PM$_{10}$) and respiratory mortality counts for seniors in greater London ($geq$65 years) during 1997. Within this substantive epidemiological study, the effects on health of ambient concentrations and (estimated) personal exposures are compared.
Air pollution constitutes the highest environmental risk factor in relation to heath. In order to provide the evidence required for health impact analyses, to inform policy and to develop potential mitigation strategies comprehensive information is required on the state of air pollution. Information on air pollution traditionally comes from ground monitoring (GM) networks but these may not be able to provide sufficient coverage and may need to be supplemented with information from other sources (e.g. chemical transport models; CTMs). However, these may only be available on grids and may not capture micro-scale features that may be important in assessing air quality in areas of high population. We develop a model that allows calibration between multiple data sources available at different levels of support by allowing the coefficients of calibration equations to vary over space and time, enabling downscaling where the data is sufficient to support it. The model is used to produce high-resolution (1km $times$ 1km) estimates of NO$_2$ and PM$_{2.5}$ across Western Europe for 2010-2016. Concentrations of both pollutants are decreasing during this period, however there remain large populations exposed to levels exceeding the WHO Air Quality Guidelines and thus air pollution remains a serious threat to health.
The analysis of data arising from environmental health studies which collect a large number of measures of exposure can benefit from using latent variable models to summarize exposure information. However, difficulties with estimation of model parameters may arise since existing fitting procedures for linear latent variable models require correctly specified residual variance structures for unbiased estimation of regression parameters quantifying the association between (latent) exposure and health outcomes. We propose an estimating equations approach for latent exposure models with longitudinal health outcomes which is robust to misspecification of the outcome variance. We show that compared to maximum likelihood, the loss of efficiency of the proposed method is relatively small when the model is correctly specified. The proposed equations formalize the ad-hoc regression on factor scores procedure, and generalize regression calibration. We propose two weighting schemes for the equations, and compare their efficiency. We apply this method to a study of the effects of in-utero lead exposure on child development.
Air pollution is a major risk factor for global health, with both ambient and household air pollution contributing substantial components of the overall global disease burden. One of the key drivers of adverse health effects is fine particulate matter ambient pollution (PM$_{2.5}$) to which an estimated 3 million deaths can be attributed annually. The primary source of information for estimating exposures has been measurements from ground monitoring networks but, although coverage is increasing, there remain regions in which monitoring is limited. Ground monitoring data therefore needs to be supplemented with information from other sources, such as satellite retrievals of aerosol optical depth and chemical transport models. A hierarchical modelling approach for integrating data from multiple sources is proposed allowing spatially-varying relationships between ground measurements and other factors that estimate air quality. Set within a Bayesian framework, the resulting Data Integration Model for Air Quality (DIMAQ) is used to estimate exposures, together with associated measures of uncertainty, on a high resolution grid covering the entire world. Bayesian analysis on this scale can be computationally challenging and here approximate Bayesian inference is performed using Integrated Nested Laplace Approximations. Model selection and assessment is performed by cross-validation with the final model offering substantial increases in predictive accuracy, particularly in regions where there is sparse ground monitoring, when compared to current approaches: root mean square error (RMSE) reduced from 17.1 to 10.7, and population weighted RMSE from 23.1 to 12.1 $mu$gm$^{-3}$. Based on summaries of the posterior distributions for each grid cell, it is estimated that 92% of the worlds population reside in areas exceeding the World Health Organizations Air Quality Guidelines.
Tropical cyclones (TCs), driven by heat exchange between the air and sea, pose a substantial risk to many communities around the world. Accurate characterization of the subsurface ocean thermal response to TC passage is crucial for accurate TC intensity forecasts and for an understanding of the role that TCs play in the global climate system. However, that characterization is complicated by the high-noise ocean environment, correlations inherent in spatio-temporal data, relative scarcity of in situ observations, and the entanglement of the TC-induced signal with seasonal signals. We present a general methodological framework that addresses these difficulties, integrating existing techniques in seasonal mean field estimation, Gaussian process modeling, and nonparametric regression into a functional ANOVA model. Importantly, we improve upon past work by properly handling seasonality, providing rigorous uncertainty quantification, and treating time as a continuous variable, rather than producing estimates that are binned in time. This functional ANOVA model is estimated using in situ subsurface temperature profiles from the Argo fleet of autonomous floats through a multi-step procedure, which (1) characterizes the upper ocean seasonal shift during the TC season; (2) models the variability in the temperature observations; (3) fits a thin plate spline using the variability estimates to account for heteroskedasticity and correlation between the observations. This spline fit reveals the ocean thermal response to TC passage. Through this framework, we obtain new scientific insights into the interaction between TCs and the ocean on a global scale, including a three-dimensional characterization of the near-surface and subsurface cooling along the TC storm track and the mixing-induced subsurface warming on the tracks right side.
The relationship between short-term exposure to air pollution and mortality or morbidity has been the subject of much recent research, in which the standard method of analysis uses Poisson linear or additive models. In this paper we use a Bayesian dynamic generalised linear model (DGLM) to estimate this relationship, which allows the standard linear or additive model to be extended in two ways: (i) the long-term trend and temporal correlation present in the health data can be modelled by an autoregressive process rather than a smooth function of calendar time; (ii) the effects of air pollution are allowed to evolve over time. The efficacy of these two extensions are investigated by applying a series of dynamic and non-dynamic models to air pollution and mortality data from Greater London. A Bayesian approach is taken throughout, and a Markov chain monte carlo simulation algorithm is presented for inference. An alternative likelihood based analysis is also presented, in order to allow a direct comparison with the only previous analysis of air pollution and health data using a DGLM.