
Multi-resolution Spatial Regression for Aggregated Data with an Application to Crop Yield Prediction

Published by Harrison Zhu
Publication date: 2021
Research field: Mathematical Statistics
Research language: English





We develop a new methodology for spatial regression of aggregated outputs on multi-resolution covariates. Such problems often occur with spatial data, for example in crop yield prediction, where the output is spatially-aggregated over an area and the covariates may be observed at multiple resolutions. Building upon previous work on aggregated output regression, we propose a regression framework to synthesise the effects of the covariates at different resolutions on the output and provide uncertainty estimation. We show that, for a crop yield prediction problem, our approach is more scalable, via variational inference, than existing multi-resolution regression models. We also show that our framework yields good predictive performance, compared to existing multi-resolution crop yield models, whilst being able to provide estimation of the underlying spatial effects.




Read also

This paper proposes a spatio-temporal model for wind speed prediction which can be run at different resolutions. The model assumes that the wind prediction of a cluster is correlated to its upstream influences in recent history, and the correlation between clusters is represented by a directed dynamic graph. A Bayesian approach is also described in which prior beliefs about the predictive errors at different data resolutions are represented in the form of Gaussian processes. The joint framework enhances the predictive performance by combining results from predictions at different data resolutions and provides reasonable uncertainty quantification. The model is evaluated on actual wind data from the Midwest U.S. and shows a superior performance compared to traditional baselines.
Gaussian random fields have been one of the most popular tools for analyzing spatial data. However, many geophysical and environmental processes often display non-Gaussian characteristics. In this paper, we propose a new class of spatial models for non-Gaussian random fields on a sphere based on a multi-resolution analysis. Using a special wavelet frame, named spherical needlets, as building blocks, the proposed model is constructed in the form of a sparse random effects model. The spatial localization of needlets, together with carefully chosen random coefficients, ensures that the model is non-Gaussian and isotropic. The model can also be expanded to include a spatially varying variance profile. The special formulation of the model enables us to develop efficient estimation and prediction procedures, in which an adaptive MCMC algorithm is used. We investigate the accuracy of parameter estimation of the proposed model, and compare its predictive performance with that of two Gaussian models by extensive numerical experiments. Practical utility of the proposed model is demonstrated through an application of the methodology to a data set of high-latitude ionospheric electrostatic potentials, generated from the LFM-MIX model of the magnetosphere-ionosphere system.
This work is motivated by the Obepine French system for SARS-CoV-2 viral load monitoring in wastewater. The objective of this work is to identify, from time-series of noisy measurements, the underlying auto-regressive signals, in a context where the measurements present numerous missing data, censoring and outliers. We propose a method based on an auto-regressive model adapted to censored data with outliers. Inference and prediction are produced via a discretised smoother. The method is validated on both simulations and real data from Obepine. The proposed method is used to denoise measurements from the quantification of the SARS-CoV-2 E gene in wastewater by RT-qPCR. The resulting smoothed signal shows a good correlation with other epidemiological indicators and an estimate of the whole system noise is produced.
Many analyses of neuroimaging data involve studying one or more regions of interest (ROIs) in a brain image. In order to do so, each ROI must first be identified. Since every brain is unique, the location, size, and shape of each ROI vary across subjects. Thus, each ROI in a brain image must either be manually identified or (semi-) automatically delineated, a task referred to as segmentation. Automatic segmentation often involves mapping a previously manually segmented image to a new brain image and propagating the labels to obtain an estimate of where each ROI is located in the new image. A more recent approach to this problem is to propagate labels from multiple manually segmented atlases and combine the results using a process known as label fusion. To date, most label fusion algorithms either employ voting procedures or impose prior structure and subsequently find the maximum a posteriori estimator (i.e., the posterior mode) through optimization. We propose using a fully Bayesian spatial regression model for label fusion that facilitates direct incorporation of covariate information while making accessible the entire posterior distribution. We discuss the implementation of our model via Markov chain Monte Carlo and illustrate the procedure through both simulation and application to segmentation of the hippocampus, an anatomical structure known to be associated with Alzheimer's disease.
Modelling disease progression of iron deficiency anaemia (IDA) following oral iron supplement prescriptions is a prerequisite for evaluating the cost-effectiveness of oral iron supplements. Electronic health records (EHRs) from the Clinical Practice Research Datalink (CPRD) provide rich longitudinal data on IDA disease progression in patients registered with 663 General Practitioner (GP) practices in the UK, but they also create challenges in statistical analyses. First, the CPRD data are clustered at multiple levels (i.e., GP practices and patients), but their large volume makes it computationally difficult to estimate standard random effects models for multi-level data. Second, observation times in the CPRD data are irregular and could be informative about the disease progression. For example, shorter/longer gap times between GP visits could be associated with deteriorating/improving IDA. Existing methods to address informative observation times are mostly based on complex joint models, which add further computational burden. To tackle these challenges, we develop a computationally efficient approach to modelling disease progression with EHR data while accounting for variability at multi-level clusters and informative observation times. We apply the proposed method to the CPRD data to investigate IDA improvement and treatment intolerance following oral iron prescriptions in UK primary care.