Atmospheric trace-gas inversion refers to any technique used to predict spatial and temporal fluxes from mole-fraction measurements and atmospheric simulations obtained from computer models. Studies to date are most often of a data-assimilation flavour, implicitly considering univariate statistical models with the flux as the variate of interest. This univariate approach typically assumes that the flux field is either a spatially correlated Gaussian process or a spatially uncorrelated non-Gaussian process, with prior expectation fixed using flux inventories (e.g., NAEI or EDGAR in Europe). Here, we extend this approach in three ways. First, we develop a bivariate model for the mole-fraction field and the flux field. The bivariate approach allows optimal prediction of both the flux field and the mole-fraction field, and it leads to significant computational savings over the univariate approach. Second, we employ a lognormal spatial process for the flux field that captures both the lognormal characteristics of the flux field (when appropriate) and its spatial dependence. Third, we propose a new, geostatistical approach for incorporating the flux inventories in our updates, such that the posterior spatial distribution of the flux field is predominantly data-driven. The approach is illustrated on a case study of methane (CH$_4$) emissions in the United Kingdom and Ireland.
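As a rough illustration of the lognormal flux component described above, the following minimal sketch simulates a positive, spatially correlated flux field by exponentiating a Gaussian process; the covariance form, grid, and parameter values are illustrative assumptions, not those of the paper.

```python
# Minimal sketch (not the paper's exact model): a lognormal spatial process,
# i.e. exp() of a Gaussian process with an exponential covariance.
# All parameter values below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n = 30                                        # grid cells per side (assumed)
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x)
s = np.column_stack([X.ravel(), Y.ravel()])   # spatial locations

sigma2, ell = 0.5, 0.2                        # assumed variance and length scale
d = np.linalg.norm(s[:, None, :] - s[None, :, :], axis=-1)
C = sigma2 * np.exp(-d / ell)                 # exponential covariance function

mu = np.log(1.0)                              # log-scale mean (e.g., from an inventory)
L = np.linalg.cholesky(C + 1e-8 * np.eye(n * n))
log_flux = mu + L @ rng.standard_normal(n * n)
flux = np.exp(log_flux)                       # positive, spatially correlated flux field
```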
Atmospheric trace-gas inversion is the procedure by which the sources and sinks of a trace gas are identified from observations of its mole fraction at isolated locations in space and time. This is inherently a spatio-temporal bivariate inversion problem, since the mole-fraction field evolves in space and time and the flux is also spatio-temporally distributed. Further, the bivariate model is likely to be non-Gaussian since the flux field is rarely Gaussian. Here, we use conditioning to construct a non-Gaussian bivariate model, and we describe some of its properties through auto- and cross-cumulant functions. A bivariate non-Gaussian, specifically trans-Gaussian, model is then achieved through the use of Box--Cox transformations, and we facilitate Bayesian inference by approximating the likelihood in a hierarchical framework. Trace-gas inversion, especially at high spatial resolution, is frequently highly sensitive to prior specification. Therefore, unlike conventional approaches, we assimilate trace-gas inventory information with the observational data at the parameter layer, thus shifting prior sensitivity from the inventory itself to its spatial characteristics (e.g., its spatial length scale). We demonstrate the approach in controlled-experiment studies of methane inversion, using fluxes extracted from inventories of the UK and Ireland and of Northern Australia.
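The trans-Gaussian construction above rests on the Box--Cox transform; the sketch below shows the transform and its inverse for a single transformation parameter (the value of lambda and the Gaussian moments are assumptions for illustration only, not values estimated in the paper).

```python
# Minimal sketch of the Box--Cox link between a Gaussian variate and a
# positive (trans-Gaussian) one. lambda = 0.4 is an illustrative assumption.
import numpy as np

def boxcox(y, lam):
    """Box--Cox transform: (y**lam - 1)/lam, with log(y) as the lam -> 0 limit."""
    y = np.asarray(y, dtype=float)
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

def inv_boxcox(z, lam):
    """Inverse transform, mapping the Gaussian scale back to the positive scale."""
    z = np.asarray(z, dtype=float)
    return np.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)

rng = np.random.default_rng(0)
z = rng.normal(loc=0.5, scale=0.3, size=1000)   # Gaussian on the transformed scale
y = inv_boxcox(z, lam=0.4)                      # skewed and positive on the original scale
```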
Despite great efforts during censuses, the occurrence of nonsampling errors such as coverage error is inevitable. Coverage error, which can be classified into two types, undercount and overcount, occurs when there is no bijective (one-to-one) mapping between the individuals in the census count and the target population -- the individuals who usually reside in the country (de jure residents). Coverage error arises for a variety of reasons: deficiencies in the census maps, errors in field operations, or people's disinclination to participate, in the undercount situation; and multiple enumeration of individuals, or enumeration of those outside the scope of the census, in the overcount situation. A routine practice for estimating net coverage error is to subtract the census count from the estimated true population, which is obtained from a dual-system (or capture-recapture) technique. The estimated coverage error usually suffers from substantial uncertainty in the direct estimate of the true population, or from other errors such as matching error. To rectify this problem and predict a more reliable coverage error rate, we propose a set of spatio-temporal mixed models. In an illustrative study of the 2010 census coverage error rates of U.S. counties with populations greater than 100,000, we select the best mixed model for prediction using the deviance information criterion (DIC) and the conditional predictive ordinate (CPO). Our proposed approach for predicting the coverage error rate and its measure of uncertainty is fully Bayesian, and it leads to a reasonable improvement over the direct coverage error rate in terms of mean squared error (MSE) and confidence interval (CI) as provided by the U.S. Census Bureau.
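The dual-system step referenced above can be illustrated with the classical Lincoln--Petersen estimator; the counts in the sketch below are invented for illustration and do not come from the study.

```python
# Minimal sketch of a dual-system (capture-recapture) estimate of the true
# population and the resulting net coverage error. Counts are hypothetical.
n_census = 95_000      # individuals enumerated in the census
n_survey = 9_800       # individuals captured in an independent coverage survey
n_matched = 9_000      # individuals found on both lists after record matching

true_pop_hat = n_census * n_survey / n_matched    # Lincoln--Petersen estimate
net_error = n_census - true_pop_hat               # negative => net undercount
error_rate = 100.0 * net_error / true_pop_hat
print(f"estimated true population: {true_pop_hat:,.0f}")
print(f"net coverage error rate: {error_rate:.2f}%")
```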
Spatio-temporal systems exhibiting multi-scale behaviour are common in applications ranging from cyber-physical systems to systems biology, yet they present formidable challenges for computational modelling and analysis. Here we consider a prototypical scenario in which spatially distributed agents decide their movement based on external inputs and a fast-equilibrating internal computation. We propose a generally applicable strategy based on statistically abstracting the internal system using Gaussian Processes, a powerful class of non-parametric regression techniques from Bayesian Machine Learning. We show, on a running example of bacterial chemotaxis, that this approach leads to accurate and much faster simulations in a variety of scenarios.
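A minimal sketch of the abstraction strategy, assuming a toy stand-in for the internal computation: fit a Gaussian-process emulator offline, then query it in place of the fast-equilibrating internal system during simulation (scikit-learn is used here for convenience; the abstract does not prescribe an implementation).

```python
# Minimal sketch: replace a fast-equilibrating internal computation with a
# Gaussian-process emulator fitted offline. The internal model here (a toy
# dose-response curve) is purely illustrative.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def internal_model(u):
    """Stand-in for the expensive internal system (e.g., a chemotaxis pathway)."""
    return np.tanh(3.0 * u) / (1.0 + u**2)

u_train = np.linspace(-2, 2, 40)[:, None]       # external-input design points
y_train = internal_model(u_train).ravel()

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6)
gp.fit(u_train, y_train)

# In the agent simulation, each movement decision now queries the cheap
# emulator instead of re-solving the internal system:
u_new = np.array([[0.37]])
response_mean, response_sd = gp.predict(u_new, return_std=True)
```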
Crime prediction plays an impactful role in enhancing public security and sustainable urban development. With recent advances in data collection and integration technologies, a large amount of urban data with rich crime-related information and fine-grained spatio-temporal logs has been recorded. Such information can deepen our understanding of the temporal evolution and spatial factors of urban crime and can enable more accurate crime prediction. In this paper, we perform crime prediction by exploiting the cross-type and spatio-temporal correlations of urban crimes. In particular, we verify the existence of correlations among different types of crime from temporal and spatial perspectives, and we propose a coherent framework to mathematically model these correlations for crime prediction. Extensive experimental results on real-world data validate the effectiveness of the proposed framework. Further experiments were conducted to understand the importance of the different correlations in crime prediction.
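The kind of cross-type temporal correlation check mentioned above can be sketched as follows, on synthetic weekly counts; the data and crime types are made up, and the paper's framework models these correlations far more richly.

```python
# Minimal sketch: Pearson correlation between weekly counts of two crime
# types per region, on synthetic data with a shared latent intensity.
import numpy as np

rng = np.random.default_rng(7)
weeks, regions = 104, 20
base = rng.gamma(2.0, 5.0, size=(weeks, regions))   # shared latent intensity
burglary = rng.poisson(base)                        # crime type A
robbery = rng.poisson(0.5 * base)                   # crime type B, correlated with A

cross_corr = np.array([
    np.corrcoef(burglary[:, r], robbery[:, r])[0, 1] for r in range(regions)
])
print(f"mean cross-type temporal correlation: {cross_corr.mean():.2f}")
```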
Ice sheet models are used to study the deglaciation of North America at the end of the last ice age (the past 21,000 years), so that we might understand whether and how existing ice sheets may shrink or disappear under climate change. Though ice sheet models have a few parameters controlling the physical behaviour of the ice mass, they also require boundary conditions for climate (spatio-temporal fields of temperature and precipitation, typically on regular grids and at monthly intervals). The behaviour of the ice sheet is highly sensitive to these fields, and there is relatively little data from geological records to constrain them, as the land was covered with ice. We develop a methodology for generating a range of plausible boundary conditions, using a low-dimensional basis representation of the spatio-temporal input. We derive this basis by combining key patterns, extracted from a small ensemble of climate model simulations of the deglaciation, with sparse spatio-temporal observations. By jointly varying the ice sheet parameters and the basis-vector coefficients, we run ensembles of the Glimmer ice sheet model that simultaneously explore both climate and ice sheet model uncertainties. We use these ensembles to calibrate the ice sheet physics and boundary conditions for Glimmer, ruling out regions of the joint coefficient and parameter space via history matching. We constrain this space using binary ice/no-ice observations from reconstructions of past ice sheet margin position, introducing a novel metric for history matching to binary data.
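A minimal sketch of the history-matching rule-out step on binary data, assuming a simple Brier-type mismatch as a placeholder for the paper's novel metric (which the abstract does not specify); all numbers are illustrative.

```python
# Minimal sketch of history matching to binary (ice/no-ice) observations.
# The Brier-type score below is an illustrative placeholder, NOT the paper's
# novel metric; the threshold and ensemble are hypothetical.
import numpy as np

rng = np.random.default_rng(3)
n_obs, n_runs = 200, 50
obs = rng.integers(0, 2, size=n_obs)              # binary ice-margin observations

# Ensemble of model outputs: probability of ice at each observation site,
# one row per (parameter, basis-coefficient) setting.
sim_prob = rng.uniform(0.0, 1.0, size=(n_runs, n_obs))

# Score each run by its mean squared (Brier-type) mismatch with the data.
mismatch = np.mean((sim_prob - obs[None, :])**2, axis=1)

threshold = 0.3                                   # illustrative cut-off
not_ruled_out = np.where(mismatch < threshold)[0]
print(f"{not_ruled_out.size} of {n_runs} runs survive history matching")
```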