With the advent of widespread global and continental-scale spatiotemporal datasets, increased attention has been given to covariance functions on spheres over time. This paper provides results for stationary covariance functions of random fields defined over $d$-dimensional spheres cross time. Specifically, we provide a bridge between the characterization in \cite{berg-porcu} for covariance functions on spheres cross time and Gneiting's lemma \citep{gneiting2002} that deals with planar surfaces. We then prove that there is a valid class of covariance functions similar in form to the Gneiting class of space-time covariance functions \citep{gneiting2002} that replaces the squared Euclidean distance with the great circle distance. Notably, the provided class is shown to be positive definite on every $d$-dimensional sphere cross time, while the Gneiting class is positive definite over $\mathbb{R}^d \times \mathbb{R}$ for fixed $d$ only. In this context, we illustrate the value of our adapted Gneiting class by comparing examples from this class to currently established nonseparable covariance classes using out-of-sample predictive criteria. These comparisons are carried out on two climate reanalysis datasets from the National Centers for Environmental Prediction and National Center for Atmospheric Research. For these datasets, we show that examples from our covariance class have better predictive performance than competing models.
Multivariate space-time data are increasingly available in various scientific disciplines. When analyzing these data, one of the key issues is to describe the multivariate space-time dependencies. Under the Gaussian framework, one needs to propose relevant models for multivariate space-time covariance functions, i.e., matrix-valued mappings with the additional requirement of non-negative definiteness. We propose a flexible parametric class of cross-covariance functions for multivariate space-time Gaussian random fields. Space-time components belong to the (univariate) Gneiting class of space-time covariance functions, with Mat\'ern or Cauchy covariance functions in the spatial margins. The smoothness and scale parameters can be different for each variable. We provide sufficient conditions for positive definiteness. A simulation study shows that the parameters of this model can be efficiently estimated using weighted pairwise likelihood, which belongs to the class of composite likelihood methods. We then illustrate the model on a French dataset of weather variables.
The assumption of separability of the covariance operator for a random image or hypersurface can be of substantial use in applications, especially in situations where the accurate estimation of the full covariance structure is infeasible, either for computational reasons or due to a small sample size. However, inferential tools to verify this assumption are somewhat lacking in high-dimensional or functional data analysis settings, where this assumption is most relevant. We propose here to test separability by focusing on $K$-dimensional projections of the difference between the covariance operator and a nonparametric separable approximation. The subspace we project onto is one generated by the eigenfunctions of the covariance operator estimated under the separability hypothesis, negating the need to ever estimate the full non-separable covariance. We show that the rescaled difference of the sample covariance operator with its separable approximation is asymptotically Gaussian. As a by-product of this result, we derive asymptotically pivotal tests under Gaussian assumptions, and propose bootstrap methods for approximating the distribution of the test statistics. We probe the finite sample performance through simulation studies, and present an application to log-spectrogram images from a phonetic linguistics dataset.
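As a rough illustration of what a nonparametric separable approximation can look like in finite dimensions, the sketch below builds one from partial traces of a covariance tensor. The abstract does not say which approximation the authors use, so the partial-trace construction, the function name `separable_approximation`, and the `(p, q, p, q)` array layout are all assumptions made for this example.

```python
import numpy as np


def separable_approximation(C):
    """Partial-trace separable approximation of a covariance tensor.

    Assumed layout (hypothetical, for illustration only):
    C[i, j, k, l] = Cov(X[i, j], X[k, l]) for a random p-by-q array X.
    Returns C1 (x) C2 / trace(C) in the same (p, q, p, q) layout.
    """
    C = np.asarray(C, dtype=float)
    # Partial trace over the second coordinate gives a p-by-p marginal.
    C1 = np.einsum("ijkj->ik", C)
    # Partial trace over the first coordinate gives a q-by-q marginal.
    C2 = np.einsum("ijil->jl", C)
    # Full trace, used to normalise the tensor product of the marginals.
    total = np.einsum("ijij->", C)
    return np.einsum("ik,jl->ijkl", C1, C2) / total


# Usage: an exactly separable covariance is reproduced by the approximation.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, 0.2, 0.0], [0.2, 1.5, 0.3], [0.0, 0.3, 0.8]])
C_sep = np.einsum("ik,jl->ijkl", A, B)
residual = np.abs(C_sep - separable_approximation(C_sep)).max()
```

When the input is separable the residual is zero up to rounding, so a nonzero residual is the kind of difference a separability test would examine. Note that this sketch forms the full covariance tensor for clarity, which is exactly what the projection approach described above is designed to avoid.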
We propose a Bayesian methodology for estimating spiked covariance matrices with jointly sparse structure in high dimensions. The spiked covariance matrix is reparametrized in terms of the latent factor model, where the loading matrix is equipped with a novel matrix spike-and-slab LASSO prior, which is a continuous shrinkage prior for modeling jointly sparse matrices. We establish the rate-optimal posterior contraction for the covariance matrix with respect to the operator norm as well as that for the principal subspace with respect to the projection operator norm loss. We also study the posterior contraction rate of the principal subspace with respect to the two-to-infinity norm loss, a novel loss function measuring the distance between subspaces that is able to capture element-wise eigenvector perturbations. We show that the posterior contraction rate with respect to the two-to-infinity norm loss is tighter than that with respect to the routinely used projection operator norm loss under certain low-rank and bounded coherence conditions. In addition, a point estimator for the principal subspace is proposed with the rate-optimal risk bound with respect to the projection operator norm loss. These results are based on a collection of concentration and large deviation inequalities for the matrix spike-and-slab LASSO prior. The numerical performance of the proposed methodology is assessed through synthetic examples and the analysis of a real-world face data example.
The Mat\'ern family of isotropic covariance functions has been central to the theoretical development and application of statistical models for geospatial data. For global data defined over the whole sphere representing planet Earth, the natural distance between any two locations is the great circle distance. In this setting, the Mat\'ern family of covariance functions has a restriction on the smoothness parameter, making it an unappealing choice to model smooth data. Finding a suitable analogue for modelling data on the sphere is still an open problem. This paper proposes a new family of isotropic covariance functions for random fields defined over the sphere. The proposed family has a parameter that indexes the mean square differentiability of the corresponding Gaussian field, and allows for any admissible range of fractal dimension. Our simulation study mimics the fixed domain asymptotic setting, which is the most natural regime for sampling on a closed and bounded set. As expected, our results support the analogous results (under the same asymptotic scheme) for planar processes, namely that not all parameters can be estimated consistently. We apply the proposed model to a dataset of precipitable water content over a large portion of the Earth, and show that the model gives more precise predictions of the underlying process at unsampled locations than does the Mat\'ern model using chordal distances.
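The two distances contrasted above can be made concrete with a minimal sketch on the unit sphere. The function names and the degrees-in, radians-out convention are assumptions chosen for this example; the only relation relied on is the standard identity that the chord subtending an angle $\theta$ has length $2\sin(\theta/2)$.

```python
import numpy as np


def great_circle_distance(lat1, lon1, lat2, lon2):
    """Great circle distance (angle in radians) on the unit sphere.

    Inputs are latitudes and longitudes in degrees (an assumed convention).
    """
    p1, l1, p2, l2 = np.radians([lat1, lon1, lat2, lon2])
    cos_theta = (np.sin(p1) * np.sin(p2)
                 + np.cos(p1) * np.cos(p2) * np.cos(l1 - l2))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))


def chordal_distance(theta):
    """Straight-line (chordal) distance for a great circle angle theta."""
    return 2.0 * np.sin(theta / 2.0)


# Usage: north pole to a point on the equator.
theta = great_circle_distance(90.0, 0.0, 0.0, 0.0)
chord = chordal_distance(theta)
```

Multiplying either quantity by the Earth's radius gives a distance in physical units. The chordal distance is always the shorter of the two, and the gap grows with the angle, which is one reason the choice of metric matters for global data.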
We offer a survey of recent results on covariance estimation for heavy-tailed distributions. By unifying ideas scattered in the literature, we propose user-friendly methods that facilitate practical implementation. Specifically, we introduce element-wise and spectrum-wise truncation operators, as well as their $M$-estimator counterparts, to robustify the sample covariance matrix. Unlike the classical notion of robustness, which is characterized by the breakdown property, we focus on tail robustness, which is evidenced by the connection between nonasymptotic deviation and confidence level. The key observation is that the estimators need to adapt to the sample size, the dimensionality of the data, and the noise level to achieve an optimal tradeoff between bias and robustness. Furthermore, to facilitate their practical use, we propose data-driven procedures that automatically calibrate the tuning parameters. We demonstrate their applications to a series of structured models in high dimensions, including bandable and low-rank covariance matrices and sparse precision matrices. Numerical studies lend strong support to the proposed methods.
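To give a feel for element-wise truncation, here is a minimal sketch of one possible estimator. It is not taken from the survey: the pairwise-difference form (which avoids estimating the mean), the clipping function, the function names, and the user-supplied level `tau` are all assumptions for illustration, and the data-driven calibration of `tau` mentioned above is not implemented.

```python
import numpy as np


def truncate(u, tau):
    """Clip values to [-tau, tau] while preserving sign."""
    return np.sign(u) * np.minimum(np.abs(u), tau)


def elementwise_truncated_cov(X, tau):
    """Element-wise truncated covariance estimate (illustrative sketch).

    X is an (n, d) data array and tau > 0 is a truncation level that the
    caller must choose. Each entry averages truncated half-products of
    pairwise differences over all pairs i < j.
    """
    X = np.asarray(X, dtype=float)
    n, _ = X.shape
    i, j = np.triu_indices(n, k=1)          # all index pairs with i < j
    diffs = X[i] - X[j]                     # shape (n_pairs, d)
    prods = 0.5 * diffs[:, :, None] * diffs[:, None, :]  # (n_pairs, d, d)
    return truncate(prods, tau).mean(axis=0)


# Usage on heavy-tailed synthetic data (Student t with 3 degrees of freedom).
rng = np.random.default_rng(0)
X = rng.standard_t(df=3, size=(200, 4))
S_robust = elementwise_truncated_cov(X, tau=5.0)
S_plain = elementwise_truncated_cov(X, tau=np.inf)  # no truncation
```

With `tau=np.inf` nothing is clipped and the estimate coincides with the usual unbiased sample covariance, so `tau` controls how far the estimator departs from it. A smaller `tau` limits the influence of extreme observations at the cost of some bias. The intermediate array has one `d`-by-`d` slice per pair, so this form is only practical for small `n` and `d`.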