A Multi-Resolution Spatial Model for Large Datasets Based on the Skew-t Distribution

92 0 0.0 ( 0 )

Download Cite

Added by Felipe Tagle

Publication date 2017

fields Mathematical Statistics

and research's language is English

Authors Felipe Tagle - Stefano Castruccio - Marc G. Genton

Methodology

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Large, non-Gaussian spatial datasets pose a considerable modeling challenge as the dependence structure implied by the model needs to be captured at different scales, while retaining feasible inference. Skew-normal and skew-t distributions have only recently begun to appear in the spatial statistics literature, without much consideration, however, for the ability to capture dependence at multiple resolutions, and simultaneously achieve feasible inference for increasingly large data sets. This article presents the first multi-resolution spatial model inspired by the skew-t distribution, where a large-scale effect follows a multivariate normal distribution and the fine-scale effects follow a multivariate skew-normal distributions. The resulting marginal distribution for each region is skew-t, thereby allowing for greater flexibility in capturing skewness and heavy tails characterizing many environmental datasets. Likelihood-based inference is performed using a Monte Carlo EM algorithm. The model is applied as a stochastic generator of daily wind speeds over Saudi Arabia.

rate research

A class of multi-resolution approximations for large spatial datasets

140 - Matthias Katzfuss , Wenlong Gong 2017

Gaussian processes are popular and flexible models for spatial, temporal, and functional data, but they are computationally infeasible for large datasets. We discuss Gaussian-process approximations that use basis functions at multiple resolutions to achieve fast inference and that can (approximately) represent any spatial covariance structure. We consider two special cases of this multi-resolution-approximation framework, a taper version and a domain-partitioning (block) version. We describe theoretical properties and inference procedures, and study the computational complexity of the methods. Numerical comparisons and an application to satellite data are also provided.

Methodology Computation

A multi-resolution approximation for massive spatial datasets

450 - Matthias Katzfuss 2015

Automated sensing instruments on satellites and aircraft have enabled the collection of massive amounts of high-resolution observations of spatial fields over large spatial regions. If these datasets can be efficiently exploited, they can provide new insights on a wide variety of issues. However, traditional spatial-statistical techniques such as kriging are not computationally feasible for big datasets. We propose a multi-resolution approximation (M-RA) of Gaussian processes observed at irregular locations in space. The M-RA process is specified as a linear combination of basis functions at multiple levels of spatial resolution, which can capture spatial structure from very fine to very large scales. The basis functions are automatically chosen to approximate a given covariance function, which can be nonstationary. All computations involving the M-RA, including parameter inference and prediction, are highly scalable for massive datasets. Crucially, the inference algorithms can also be parallelized to take full advantage of large distributed-memory computing environments. In comparisons using simulated data and a large satellite dataset, the M-RA outperforms a related state-of-the-art method.

Methodology Computation

Robust mixture of experts modeling using the skew $t$ distribution

127 - Faicel Chamroukhi 2016

Mixture of Experts (MoE) is a popular framework in the fields of statistics and machine learning for modeling heterogeneity in data for regression, classification and clustering. MoE for continuous data are usually based on the normal distribution. However, it is known that for data with asymmetric behavior, heavy tails and atypical observations, the use of the normal distribution is unsuitable. We introduce a new robust non-normal mixture of experts modeling using the skew $t$ distribution. The proposed skew $t$ mixture of experts, named STMoE, handles these issues of the normal mixtures experts regarding possibly skewed, heavy-tailed and noisy data. We develop a dedicated expectation conditional maximization (ECM) algorithm to estimate the model parameters by monotonically maximizing the observed data log-likelihood. We describe how the presented model can be used in prediction and in model-based clustering of regression data. Numerical experiments carried out on simulated data show the effectiveness and the robustness of the proposed model in fitting non-linear regression functions as well as in model-based clustering. Then, the proposed model is applied to the real-world data of tone perception for musical data analysis, and the one of temperature anomalies for the analysis of climate change data. The obtained results confirm the usefulness of the model for practical data analysis applications.

Methodology Machine Learning Machine Learning

A Non-Gaussian Spatio-Temporal Model for Daily Wind Speeds Based on a Multivariate Skew-t Distribution

124 - Felipe Tagle , Stefano Castruccio , Paola Crippa 2017

Facing increasing domestic energy consumption from population growth and industrialization, Saudi Arabia is aiming to reduce its reliance on fossil fuels and to broaden its energy mix by expanding investment in renewable energy sources, including wind energy. A preliminary task in the development of wind energy infrastructure is the assessment of wind energy potential, a key aspect of which is the characterization of its spatio-temporal behavior. In this study we examine the impact of internal climate variability on seasonal wind power density fluctuations over Saudi Arabia using 30 simulations from the Large Ensemble Project (LENS) developed at the National Center for Atmospheric Research. Furthermore, a spatio-temporal model for daily wind speed is proposed with neighbor-based cross-temporal dependence, and a multivariate skew-t distribution to capture the spatial patterns of higher order moments. The model can be used to generate synthetic time series over the entire spatial domain that adequately reproduce the internal variability of the LENS dataset.

Applications

On the Lp-quantiles for the Student t distribution

53 - Mauro Bernardi , Valeria Bignozzi , Lea Petrella 2016

L_p-quantiles represent an important class of generalised quantiles and are defined as the minimisers of an expected asymmetric power function, see Chen (1996). For p=1 and p=2 they correspond respectively to the quantiles and the expectiles. In his paper Koenker (1993) showed that the tau quantile and the tau expectile coincide for every tau in (0,1) for a class of rescaled Student t distributions with two degrees of freedom. Here, we extend this result proving that for the Student t distribution with p degrees of freedom, the tau quantile and the tau L_p-quantile coincide for every tau in (0,1) and the same holds for any affine transformation. Furthermore, we investigate the properties of L_p-quantiles and provide recursive equations for the truncated moments of the Student t distribution.

Methodology