Arctic sea ice plays an important role in the global climate. Sea ice models governed by physical equations have been used to simulate the state of the ice, including characteristics such as ice thickness, concentration, and motion. More recent models also attempt to capture features such as fractures or leads in the ice. These simulated features can be partially misaligned or misshapen when compared to observational data, whether due to numerical approximation or incomplete physics. To make realistic forecasts and improve understanding of the underlying processes, it is necessary to calibrate the numerical model to field data. Traditional calibration methods based on generalized least-squares metrics are flawed for linear features such as sea ice cracks. We develop a statistical emulation and calibration framework that accounts for feature misalignment and misshapenness by optimally aligning model output with observed features using cutting-edge image registration techniques. This work may also apply to other physical models that produce coherent structures.
Crime prevention strategies based on early intervention depend on accurate risk assessment instruments for identifying high-risk youth. In this context, it is important that the instruments be convenient to administer, which means, in particular, that they must be reasonably brief; adaptive screening tests are useful for this purpose. Although item response theory (IRT) has a long and rich history of producing reliable adaptive tests, adaptive tests constructed using classification and regression trees are becoming a popular alternative to the traditional IRT approach for item selection. On the upside, unlike IRT, tree-based questionnaires require no real-time parameter estimation during administration. On the downside, while IRT provides robust criteria for terminating the exam, the stopping criterion for a tree-based adaptive test (the maximum tree depth) is unclear. We present a Bayesian decision theory approach for characterizing the trade-offs of administering tree-based questionnaires of different lengths. This formalism involves specifying 1) a utility function measuring the goodness of the assessment; 2) a target population over which this utility should be maximized; 3) an action space comprising assessments of different lengths, populated via a tree-fitting algorithm. Using this framework, we provide uncertainty estimates for the trade-offs of shortening the exam, allowing practitioners to determine an optimal exam length in a principled way. The method is demonstrated through an application to youth delinquency risk assessment in Honduras.
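As a rough illustration of the decision-theoretic idea in the abstract above, the following minimal sketch picks a questionnaire length by maximizing expected utility over posterior draws. Everything in it is an assumption made for illustration and not taken from the paper: the utility (classification accuracy minus a linear per-item cost), the names `expected_utilities`, `best_depth`, and `utility_loss_draws`, and the toy numbers.

```python
import statistics


def expected_utilities(accuracy_draws, cost_per_item):
    """Posterior-mean utility for each candidate tree depth.

    accuracy_draws maps a depth (number of items asked) to a list of
    posterior draws of assessment accuracy at that depth. The utility
    "accuracy minus cost_per_item * depth" is a hypothetical choice.
    """
    return {
        depth: statistics.fmean(a - cost_per_item * depth for a in draws)
        for depth, draws in accuracy_draws.items()
    }


def best_depth(accuracy_draws, cost_per_item):
    """Return the depth with the highest expected utility."""
    utils = expected_utilities(accuracy_draws, cost_per_item)
    return max(utils, key=utils.get)


def utility_loss_draws(accuracy_draws, short, full, cost_per_item):
    """Per-draw utility difference (full minus short), assuming paired draws.

    The spread of the returned values is one way to express uncertainty
    about the trade-off of shortening the questionnaire.
    """
    return [
        (a_full - cost_per_item * full) - (a_short - cost_per_item * short)
        for a_short, a_full in zip(accuracy_draws[short], accuracy_draws[full])
    ]


# Toy posterior draws of accuracy by depth (made-up numbers).
toy_draws = {
    1: [0.60, 0.62, 0.58],
    2: [0.70, 0.72, 0.68],
    3: [0.74, 0.75, 0.73],
    4: [0.75, 0.76, 0.74],
}
chosen = best_depth(toy_draws, cost_per_item=0.02)
```

In this toy setup, a higher per-item cost pushes the chosen depth toward shorter questionnaires, while a zero cost favours the deepest tree.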
Gaussian random fields have been one of the most popular tools for analyzing spatial data. However, many geophysical and environmental processes display non-Gaussian characteristics. In this paper, we propose a new class of spatial models for non-Gaussian random fields on a sphere based on a multi-resolution analysis. Using a special wavelet frame, named spherical needlets, as building blocks, the proposed model is constructed in the form of a sparse random effects model. The spatial localization of needlets, together with carefully chosen random coefficients, ensures that the model is non-Gaussian and isotropic. The model can also be expanded to include a spatially varying variance profile. The special formulation of the model enables us to develop efficient estimation and prediction procedures, in which an adaptive MCMC algorithm is used. We investigate the accuracy of parameter estimation for the proposed model and compare its predictive performance with that of two Gaussian models through extensive numerical experiments. The practical utility of the proposed model is demonstrated through an application of the methodology to a data set of high-latitude ionospheric electrostatic potentials, generated from the LFM-MIX model of the magnetosphere-ionosphere system.
The algorithms used for optimal management of ambulances require an accurate description and prediction of the spatio-temporal evolution of emergency interventions. In recent years, several authors have proposed sophisticated statistical approaches to forecast ambulance dispatches, typically modelling the events as a point pattern occurring on a planar region. Nevertheless, ambulance interventions can be more appropriately modelled as a realisation of a point process occurring along a network of lines, such as a road network. The constrained spatial domain raises specific challenges and unique methodological problems that cannot be ignored when developing a proper statistical model. Hence, this paper proposes a spatio-temporal model to analyse the ambulance interventions that occurred in the road network of Milan (Italy) from 2015 to 2017. We adopt a non-separable first-order intensity function with spatial and temporal terms. The temporal component is estimated semi-parametrically using a Poisson regression model, while the spatial component is estimated nonparametrically using a network kernel function. A set of weights is included in the spatial term to capture space-time interactions, inducing non-separability in the intensity function. A series of maps and graphical tests shows that our approach successfully models the ambulance interventions and captures the space-time patterns.
This work is motivated by the French Obepine system for monitoring SARS-CoV-2 viral load in wastewater. The objective is to identify, from time series of noisy measurements, the underlying auto-regressive signals, in a context where the measurements contain numerous missing values, censored values and outliers. We propose a method based on an auto-regressive model adapted to censored data with outliers. Inference and prediction are produced via a discretised smoother. The method is validated both on simulations and on real data from Obepine. It is then used to denoise measurements from the quantification of the SARS-CoV-2 E gene in wastewater by RT-qPCR. The resulting smoothed signal shows a good correlation with other epidemiological indicators, and an estimate of the overall system noise is produced.
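To make the measurement setting of the abstract above concrete, here is a minimal simulation sketch of a latent AR(1) signal observed with noise, left-censoring, missing values and occasional outliers. It only generates data of the kind described and does not implement the paper's discretised smoother. The function name `simulate_censored_ar1`, the AR(1) order, the Gaussian noise, the scale-inflation outlier mechanism and all default parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulate_censored_ar1(n=200, phi=0.9, sigma_state=0.3, sigma_obs=0.2,
                          detection_limit=-0.5, p_missing=0.1,
                          p_outlier=0.05, outlier_scale=5.0, seed=0):
    """Simulate a latent AR(1) signal and degraded observations of it.

    Returns three lists of length n:
      latent   - the underlying AR(1) signal
      observed - None if missing, the detection limit if censored,
                 otherwise the noisy measurement
      censored - True where the measurement fell below the detection limit
    """
    rng = random.Random(seed)
    # Start from the stationary distribution of the AR(1) process.
    x = rng.gauss(0.0, sigma_state / math.sqrt(1.0 - phi ** 2))
    latent, observed, censored = [], [], []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, sigma_state)
        latent.append(x)
        if rng.random() < p_missing:
            # No measurement was taken at this time step.
            observed.append(None)
            censored.append(False)
            continue
        # Outliers are modelled here as measurements with inflated noise.
        is_outlier = rng.random() < p_outlier
        scale = sigma_obs * (outlier_scale if is_outlier else 1.0)
        y = x + rng.gauss(0.0, scale)
        if y < detection_limit:
            # Left-censoring: only the detection limit is recorded.
            observed.append(detection_limit)
            censored.append(True)
        else:
            observed.append(y)
            censored.append(False)
    return latent, observed, censored


latent, observed, censored = simulate_censored_ar1()
```

Data simulated this way can serve as a test bed for any smoother that has to handle missing, censored and outlying measurements at the same time.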
We develop a new methodology for spatial regression of aggregated outputs on multi-resolution covariates. Such problems often occur with spatial data, for example in crop yield prediction, where the output is spatially aggregated over an area and the covariates may be observed at multiple resolutions. Building upon previous work on aggregated output regression, we propose a regression framework that synthesises the effects of the covariates at different resolutions on the output and provides uncertainty estimation. We show that, for a crop yield prediction problem, our approach is more scalable, via variational inference, than existing multi-resolution regression models. We also show that our framework yields good predictive performance compared to existing multi-resolution crop yield models, whilst also providing estimates of the underlying spatial effects.