Multivariate measurements taken at different spatial locations occur frequently in practice. Proper analysis of such data needs to consider not only dependencies on-site but also dependencies within and between variables as a function of spatial separation. Spatial Blind Source Separation (SBSS) is a recently developed unsupervised statistical tool that deals with such data by assuming that the observable data is formed by a linear latent variable model. In SBSS the latent variable is assumed to consist of weakly stationary random fields which are uncorrelated. Such a model is appealing for three reasons: further analysis can be carried out on the marginal distributions of the latent variables, interpretations are straightforward as the model is assumed to be linear, and not all components of the latent field need be of interest, which acts as a form of dimension reduction. The weak stationarity assumption of SBSS implies that the mean of the data is constant for all sample locations, which might be too restrictive in practical applications. Therefore, an adaptation of SBSS that uses scatter matrices based on differences was recently suggested in the literature. In our contribution we formalize these ideas, suggest an adapted SBSS method and show its usefulness on synthetic and real data.
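To illustrate why differences help when the mean is not constant, the following is a minimal sketch of one possible difference-based local scatter matrix, closely related to the classical empirical variogram matrix. It is not taken from the paper: the function name `local_difference_scatter`, the ball-shaped neighbourhood and the normalisation by the number of pairs are all assumptions made for illustration.

```python
import numpy as np

def local_difference_scatter(x, coords, radius):
    """Average outer product of differences x_i - x_j over pairs within `radius`.

    x      : (n, p) array of multivariate observations
    coords : (n, d) array of sample locations
    radius : neighbourhood radius (hypothetical tuning parameter)
    """
    # Pairwise Euclidean distances between all sample locations.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Each unordered pair (i < j) whose locations are at most `radius` apart.
    i, j = np.nonzero(np.triu(d <= radius, k=1))
    diffs = x[i] - x[j]
    # Normalise by the number of pairs; guard against an empty neighbourhood.
    return diffs.T @ diffs / max(len(i), 1)
```

Because only differences enter, adding the same constant vector to every observation leaves the matrix unchanged, and a smoothly varying mean contributes little when the radius is small.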
Recently a blind source separation model was suggested for spatial data together with an estimator based on the simultaneous diagonalisation of two scatter matrices. The asymptotic properties of this estimator are derived here and a new estimator, based on the joint diagonalisation of more than two scatter matrices, is proposed. The asymptotic properties and merits of the novel estimator are verified in simulation studies. A real data example illustrates the method.
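The two-scatter estimator mentioned above can be sketched as follows: whiten the data with the sample covariance, then rotate with the eigenvectors of a local covariance matrix computed on the whitened data. This is a minimal sketch under stated assumptions, not the authors' implementation: the ball kernel, its normalisation by the sample size, and the names `ball_kernel_scatter` and `sbss_two_scatter` are illustrative choices.

```python
import numpy as np

def ball_kernel_scatter(z, coords, radius):
    """Local covariance: sum of z_i z_j^T over pairs i != j within `radius`, divided by n."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Indicator weights for distinct locations that lie within the ball.
    w = ((d <= radius) & (d > 0)).astype(float)
    m = (z.T @ w @ z) / z.shape[0]
    # Symmetrise to remove rounding asymmetry before the eigendecomposition.
    return (m + m.T) / 2

def sbss_two_scatter(x, coords, radius):
    """Return (unmixing matrix, estimated sources) from two scatter matrices."""
    xc = x - x.mean(axis=0)
    # First scatter: sample covariance, used for whitening.
    cov = np.cov(xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = xc @ whitener
    # Second scatter: local covariance of the whitened data.
    m = ball_kernel_scatter(z, coords, radius)
    _, u = np.linalg.eigh(m)
    unmixing = u.T @ whitener
    return unmixing, xc @ unmixing.T
```

By construction the estimated sources have identity sample covariance and a diagonal local covariance matrix. Jointly diagonalising more than two scatter matrices, as the novel estimator does, requires an approximate joint diagonalisation algorithm instead of the single eigendecomposition used here.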
Regional data analysis is concerned with the analysis and modeling of spatially separated measurements, specifically accounting for typical features of such data. Namely, measurements in close proximity tend to be more similar than those further apart. This might also hold true for cross-dependencies when multivariate spatial data is considered. Often, scientists are interested in linear transformations of such data which are easy to interpret and might be used for dimension reduction. Recently, spatial blind source separation (SBSS) was introduced for that purpose; it assumes that the observed data are formed by a linear mixture of uncorrelated, weakly stationary random fields. However, it is well known in practical applications that when the spatial domain increases in size the weak stationarity assumptions can be violated, in the sense that the second-order dependency varies over the domain, which leads to non-stationary analysis. In our work we extend the SBSS model to adjust for these stationarity violations, present three novel estimators and establish the identifiability and affine equivariance property of the unmixing matrix functionals defining these estimators. In an extensive simulation study, we investigate the performance of our estimators and also show their use in the analysis of a geochemical dataset which is derived from the GEMAS geochemical mapping project.
We assume a spatial blind source separation model in which the observed multivariate spatial data is a linear mixture of latent spatially uncorrelated Gaussian random fields containing a number of pure white noise components. We propose a test on the number of white noise components and obtain the asymptotic distribution of its statistic for a general domain. We also demonstrate how computations can be facilitated in the case of gridded observation locations. Based on this test, we obtain a consistent estimator of the true dimension. Simulation studies and an environmental application demonstrate that our test is at least comparable to and often outperforms bootstrap-based techniques, which are also introduced in this paper.
Multivariate measurements taken at irregularly sampled locations are a common form of data, for example in geochemical analysis of soil. In practice, predictions of these measurements at unobserved locations are of great interest. Standard multivariate spatial prediction methods require modelling not only spatial dependencies but also cross-dependencies, which makes prediction a demanding task. Recently, a blind source separation approach for spatial data was suggested. When this spatial blind source separation method is applied prior to the actual spatial prediction, modelling of spatial cross-dependencies is avoided, which in turn simplifies the spatial prediction task significantly. In this paper we investigate the use of spatial blind source separation as a pre-processing tool for spatial prediction and compare it with predictions from Cokriging and neural networks in an extensive simulation study as well as on a geochemical dataset.
Unsupervised blind source separation methods do not require a training phase and thus cannot suffer from a train-test mismatch, which is a common concern in neural network based source separation. The unsupervised techniques can be categorized in two classes: those building upon the sparsity of speech in the short-time Fourier transform domain and those exploiting non-Gaussianity or non-stationarity of the source signals. In this contribution, spatial mixture models, which fall in the first category, and independent vector analysis (IVA), as a representative of the second category, are compared with respect to their separation performance and the performance of a downstream speech recognizer on a reverberant dataset of reasonable size. Furthermore, we introduce a serial concatenation of the two, where the result of the mixture model serves as initialization of IVA, which achieves significantly better word error rate (WER) performance than each algorithm individually and even approaches the performance of a much more complex neural network based technique.