Balancing spatial and non-spatial variation in varying coefficient modeling: a remedy for spurious correlation



Added by Daisuke Murakami

Publication date: 2020

Field: Mathematical Statistics

Language: English

Authors: Daisuke Murakami, Daniel A. Griffith

Applications



This study discusses the importance of balancing spatial and non-spatial variation in spatial regression modeling. Unlike spatially varying coefficient (SVC) modeling, which is popular in spatial statistics, non-spatially varying coefficient (NVC) modeling has been largely unexplored in spatial fields. Nevertheless, as we will explain, consideration of non-spatial variation is needed not only to improve model accuracy but also to reduce spurious correlation among varying coefficients, which is a major problem in SVC modeling. We consider a Moran eigenvector approach that models spatially and non-spatially varying coefficients (S&NVC). A Monte Carlo simulation experiment comparing our S&NVC model with existing SVC models suggests that our approach is both accurate and computationally efficient. Beyond that, somewhat surprisingly, our approach identifies true and spurious correlations among coefficients nearly perfectly, even when usual SVC models suffer from severe spurious correlations. This implies that the S&NVC model should be used even when the purpose of the analysis is to model SVCs. Finally, our S&NVC model is employed to analyze a residential land price dataset. The results suggest that both spatial and non-spatial variation exist in regression coefficients in practice. The S&NVC model is now implemented in the R package spmoran.
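
As a rough sketch of the structure the abstract describes, the block below writes out the standard Moran eigenvector construction and one way a coefficient could combine a spatial and a non-spatial part. The abstract itself gives no notation, so every symbol here is an assumption made for illustration: C is a symmetric spatial connectivity matrix with zero diagonal, n is the sample size, L is the number of retained eigenvectors, and f_k is a smooth function of the k-th covariate.

```latex
% Moran eigenvectors: eigenvectors of the doubly centered connectivity matrix.
% Eigenvectors with large positive eigenvalues describe positively
% autocorrelated (smooth) map patterns.
\mathbf{M}\mathbf{C}\mathbf{M} = \mathbf{E}\boldsymbol{\Lambda}\mathbf{E}^{\top},
\qquad
\mathbf{M} = \mathbf{I} - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top}

% Assumed form of a spatially and non-spatially varying coefficient at site i:
% constant mean + spatial part (Moran eigenvectors) + non-spatial part.
\beta_k(i) = b_k + \sum_{l=1}^{L} e_{i,l}\,\gamma^{(s)}_{k,l} + f_k(x_{i,k})
```

Under this reading, an SVC-only model drops the last term, so any variation that actually follows the covariate has to be absorbed by the spatial part.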


Bayesian Time Varying Coefficient Model with Applications to Marketing Mix Modeling

Edwin Ng, Zhishi Wang, Athena Dai (2021)

Both Bayesian and varying coefficient models are very useful tools in practice, as they can be used to model parameter heterogeneity in a generalizable way. Motivated by the need to enhance Marketing Mix Modeling (MMM) at Uber, we propose a Bayesian Time Varying Coefficient model, equipped with a hierarchical Bayesian structure. This model differs from other time varying coefficient models in that the coefficients are weighted over a set of local latent variables following certain probabilistic distributions. Stochastic Variational Inference is used to approximate the posteriors of latent variables and dynamic coefficients. The proposed model also helps address many challenges faced by traditional MMM approaches. We used simulations as well as real-world marketing datasets to demonstrate our model's superior performance in terms of both accuracy and interpretability.

Applications Methodology

A multivariate semiparametric Bayesian spatial modeling framework for hurricane surface wind fields

Brian J. Reich, Montserrat Fuentes (2007)

Storm surge, the onshore rush of sea water caused by the high winds and low pressure associated with a hurricane, can compound the effects of inland flooding caused by rainfall, leading to loss of property and loss of life for residents of coastal areas. Numerical ocean models are essential for creating storm surge forecasts for coastal areas. These models are driven primarily by the surface wind forcings. Currently, the gridded wind fields used by ocean models are specified by deterministic formulas that are based on the central pressure and location of the storm center. While these equations incorporate important physical knowledge about the structure of hurricane surface wind fields, they cannot always capture the asymmetric and dynamic nature of a hurricane. A new Bayesian multivariate spatial statistical modeling framework is introduced that combines data with physical knowledge about the wind fields to improve the estimation of the wind vectors. Many spatial models assume the data follow a Gaussian distribution. However, this may be overly restrictive for wind field data, which often display erratic behavior, such as sudden changes in time or space. In this paper we develop a semiparametric multivariate spatial model for these data. Our model builds on the stick-breaking prior, which is frequently used in Bayesian modeling to capture uncertainty in the parametric form of an outcome. The stick-breaking prior is extended to the spatial setting by assigning each location a different, unknown distribution, and smoothing the distributions in space with a series of kernel functions. This semiparametric spatial model is shown to improve prediction compared to usual Bayesian Kriging methods for the wind field of Hurricane Ivan.
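
The stick-breaking prior mentioned above can be illustrated with a minimal sketch of the standard, non-spatial construction: mixture weights are built by repeatedly breaking off a Beta-distributed fraction of what remains of a unit-length stick. The function name, the Beta(1, alpha) choice, and the truncation level are assumptions for illustration only; the kernel-based spatial extension described in the abstract is not shown.

```python
import random


def stick_breaking_weights(alpha, n_components, rng):
    """Truncated stick-breaking weights: p_k = V_k * prod_{j<k} (1 - V_j).

    V_k ~ Beta(1, alpha) for k < n_components; the last V is set to 1 so
    that the truncated weights sum to one (an illustrative choice).
    """
    weights = []
    remaining = 1.0  # length of stick not yet assigned
    for k in range(n_components):
        if k == n_components - 1:
            v = 1.0  # truncation: the last component takes what is left
        else:
            v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = stick_breaking_weights(alpha=2.0, n_components=10, rng=rng)
```

Smaller values of `alpha` tend to concentrate the mass on the first few components, while larger values spread it over more components.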

Applications

A Spatial Concordance Correlation Coefficient with an Application to Image Analysis

Ronny Vallejos, Javier Perez, Aaron M. Ellison (2019)

In this work we define a spatial concordance coefficient for second-order stationary processes. This problem has been widely addressed in a non-spatial context, but here we consider a coefficient that, for a fixed spatial lag, allows one to compare two spatial sequences along a 45-degree line. The proposed coefficient was explored for the bivariate Matérn and Wendland covariance functions. The asymptotic normality of a sample version of the spatial concordance coefficient for an increasing domain sampling framework was established for the Wendland covariance function. To work with large digital images, we developed a local approach for estimating the concordance that uses local spatial models on non-overlapping windows. Monte Carlo simulations were used to gain additional insights into the asymptotic properties for finite sample sizes. As an illustrative example, we applied this methodology to two similar images of a deciduous forest canopy. The images were recorded with different cameras but similar fields-of-view and within minutes of each other. Our analysis showed that the local approach helped to explain a percentage of the non-spatial concordance and provided additional information about its decay as a function of the spatial lag.
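
For orientation, the classical non-spatial concordance correlation coefficient (Lin's coefficient) that the abstract generalizes can be sketched as follows. It measures agreement with the 45-degree line by penalizing both weak correlation and shifts in mean or scale. This is only the non-spatial baseline; the function name is hypothetical and the spatial-lag version proposed in the paper is not implemented here.

```python
def concordance(x, y):
    """Sample concordance correlation coefficient (non-spatial version).

    ccc = 2 * s_xy / (s_xx + s_yy + (mean_x - mean_y)^2),
    using 1/n moment estimates for the variances and the covariance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x) / n
    s_yy = sum((b - mean_y) ** 2 for b in y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * s_xy / (s_xx + s_yy + (mean_x - mean_y) ** 2)


x = [1.0, 2.0, 3.0, 4.0]
identical = concordance(x, x)                     # perfect agreement
shifted = concordance(x, [a + 1.0 for a in x])    # same shape, shifted mean
```

The shifted sequence is perfectly correlated with `x`, yet its concordance is below one because the points lie off the 45-degree line.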

Methodology

The Bayesian Spatial Bradley--Terry Model: Urban Deprivation Modeling in Tanzania

R. G. Seymour, D. Sirl, S. Preston (2020)

Identifying the most deprived regions of any country or city is key if policy makers are to design successful interventions. However, locating areas with the greatest need is often surprisingly challenging in developing countries. Due to the logistical challenges of traditional household surveying, official statistics can be slow to be updated; estimates that exist can be coarse, a consequence of prohibitive costs and poor infrastructure; and mass urbanisation can render manually surveyed figures rapidly out-of-date. Comparative judgement models, such as the Bradley--Terry model, offer a promising solution. Leveraging local knowledge, elicited via comparisons of different areas' affluence, such models can both simplify logistics and circumvent biases inherent to household surveys. Yet widespread adoption remains limited, due to the large amount of data existing approaches still require. We address this via development of a novel Bayesian Spatial Bradley--Terry model, which substantially decreases the number of comparisons required for effective inference. This model integrates a network representation of the city or country, along with assumptions of spatial smoothness that allow deprivation in one area to be informed by neighbouring areas. We demonstrate the practical effectiveness of this method through a novel comparative judgement data set collected in Dar es Salaam, Tanzania.
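
The standard (non-spatial) Bradley--Terry model underlying this work can be sketched in a few lines: each area has a latent quality parameter, and the probability that one area is judged above another depends only on the difference of their parameters. The area labels, parameter values, and comparison data below are hypothetical, and the spatial prior described in the abstract is not included.

```python
import math


def bt_win_probability(lambda_i, lambda_j):
    """P(i is preferred to j) = exp(l_i) / (exp(l_i) + exp(l_j))."""
    return 1.0 / (1.0 + math.exp(-(lambda_i - lambda_j)))


def bt_log_likelihood(params, comparisons):
    """Log-likelihood of observed (winner, loser) pairs."""
    total = 0.0
    for winner, loser in comparisons:
        total += math.log(bt_win_probability(params[winner], params[loser]))
    return total


# Hypothetical latent affluence parameters for three areas.
params = {"A": 1.0, "B": 0.0, "C": -0.5}
# Hypothetical judgements: (area judged more affluent, area judged less affluent).
comparisons = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "B")]

log_lik = bt_log_likelihood(params, comparisons)
```

In a Bayesian treatment, a prior is placed on `params`; a spatially smooth prior would let comparisons involving one area inform the parameters of its neighbours.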

Applications Computation Methodology

Scalable model selection for spatial additive mixed modeling: application to crime analysis

Daisuke Murakami, Mami Kajita, Seiji Kajita (2020)

A rapid growth in spatial open datasets has led to a huge demand for regression approaches accommodating spatial and non-spatial effects in big data. Regression model selection is particularly important to stably estimate flexible regression models. However, conventional methods can be slow for large samples. Hence, we develop a fast and practical model-selection approach for spatial regression models, focusing on the selection of coefficient types that include constant, spatially varying, and non-spatially varying coefficients. A pre-processing approach, which replaces data matrices with small inner products through dimension reduction, dramatically accelerates the computation speed of model selection. Numerical experiments show that our approach selects the model accurately and with high computational efficiency, highlighting the importance of model selection in the spatial regression context. Then, the present approach is applied to open data to investigate local factors affecting crime in Japan. The results suggest that our approach is useful not only for selecting factors influencing crime risk but also for predicting crime events. This scalable model selection will be key to appropriately specifying flexible and large-scale spatial regression models in the era of big data. The developed model selection approach was implemented in the R package spmoran.
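
The general idea of replacing data matrices with small inner products can be illustrated with ordinary least squares, where the objective depends on the data only through three fixed-size quantities. The abstract does not spell out the paper's actual pre-processing, so the identity below is a generic illustration with assumed notation (y is the n-vector of responses, X the n-by-p design matrix, b the coefficient vector), not the authors' algorithm.

```latex
% The residual sum of squares depends on the data only through
% y'y (scalar), X'y (p x 1) and X'X (p x p), none of which grows with n.
\|\mathbf{y} - \mathbf{X}\mathbf{b}\|^{2}
  = \mathbf{y}^{\top}\mathbf{y}
  - 2\,\mathbf{b}^{\top}\mathbf{X}^{\top}\mathbf{y}
  + \mathbf{b}^{\top}\mathbf{X}^{\top}\mathbf{X}\,\mathbf{b}
```

Once these inner products are computed in a single pass over the data, candidate models can be compared repeatedly at a cost that depends on p rather than on the sample size n.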

Applications