No Arabic abstract
Dialect variation is of considerable interest in linguistics and other social sciences. However, traditionally it has been studied using proxies (transcriptions) rather than acoustic recordings directly. We introduce novel statistical techniques to analyse geolocalised speech recordings and to explore the spatial variation of pronunciations continuously over the region of interest, as opposed to traditional isoglosses, which provide a discrete partition of the region. Data of this type require an explicit modeling of the variation in the mean and the covariance. Usual Euclidean metrics are not appropriate, and we therefore introduce the concept of $d$-covariance, which allows consistent estimation both in space and at individual locations. We then propose spatial smoothing for these objects which accounts for the possibly non convex geometry of the domain of interest. We apply the proposed method to data from the spoken part of the British National Corpus, deposited at the British Library, London, and we produce maps of the dialect variation over Great Britain. In addition, the methods allow for acoustic reconstruction across the domain of interest, allowing researchers to listen to the statistical analysis.
This paper extends Bayesian mortality projection models for multiple populations considering the stochastic structure and the effect of spatial autocorrelation among the observations. We explain high levels of overdispersion according to adjacent locations based on the conditional autoregressive model. In an empirical study, we compare different hierarchical projection models for the analysis of geographical diversity in mortality between the Japanese counties in multiple years, according to age. By a Markov chain Monte Carlo (MCMC) computation, results have demonstrated the flexibility and predictive performance of our proposed model.
In spatial statistics, it is often assumed that the spatial field of interest is stationary and its covariance has a simple parametric form, but these assumptions are not appropriate in many applications. Given replicate observations of a Gaussian spatial field, we propose nonstationary and nonparametric Bayesian inference on the spatial dependence. Instead of estimating the quadratic (in the number of spatial locations) entries of the covariance matrix, the idea is to infer a near-linear number of nonzero entries in a sparse Cholesky factor of the precision matrix. Our prior assumptions are motivated by recent results on the exponential decay of the entries of this Cholesky factor for Matern-type covariances under a specific ordering scheme. Our methods are highly scalable and parallelizable. We conduct numerical comparisons and apply our methodology to climate-model output, enabling statistical emulation of an expensive physical model.
Incorporating preclinical animal data, which can be regarded as a special kind of historical data, into phase I clinical trials can improve decision making when very little about human toxicity is known. In this paper, we develop a robust hierarchical modelling approach to leverage animal data into new phase I clinical trials, where we bridge across non-overlapping, potentially heterogeneous patient subgroups. Translation parameters are used to bring both historical and contemporary data onto a common dosing scale. This leads to feasible exchangeability assumptions that the parameter vectors, which underpin the dose-toxicity relationship per study, are assumed to be drawn from a common distribution. Moreover, human dose-toxicity parameter vectors are assumed to be exchangeable either with the standardised, animal study-specific parameter vectors, or between themselves. Possibility of non-exchangeability for each parameter vector is considered to avoid inferences for extreme subgroups being overly influenced by the other. We illustrate the proposed approach with several trial data examples, and evaluate the operating characteristics of our model compared with several alternatives in a simulation study. Numerical results show that our approach yields robust inferences in circumstances, where data from multiple sources are inconsistent and/or the bridging assumptions are incorrect.
In employing spatial regression models for counts, we usually meet two issues. First, ignoring the inherent collinearity between covariates and the spatial effect would lead to causal inferences. Second, real count data usually reveal over or under-dispersion where the classical Poisson model is not appropriate to use. We propose a flexible Bayesian hierarchical modeling approach by joining non-confounding spatial methodology and a newly reconsidered dispersed count modeling from the renewal theory to control the issues. Specifically, we extend the methodology for analyzing spatial count data based on the gamma distribution assumption for waiting times. The model can be formulated as a latent Gaussian model, and consequently, we can carry out the fast computation using the integrated nested Laplace approximation method. We also examine different popular approaches for handling spatial confounding and compare their performances in the presence of dispersion. We use the proposed methodology to analyze a clinical dataset related to stomach cancer incidence in Slovenia and perform a simulation study to understand the proposed approachs merits better.
Motivated by an example from remote sensing of gas emission sources, we derive two novel change point procedures for multivariate time series where, in contrast to classical change point literature, the changes are not required to be aligned in the different components of the time series. Instead the change points are described by a functional relationship where the precise shape depends on unknown parameters of interest such as the source of the gas emission in the above example. Two different types of tests and the corresponding estimators for the unknown parameters describing the change locations are proposed. We derive the null asymptotics for both tests under weak assumptions on the error time series and show asymptotic consistency under alternatives. Furthermore, we prove consistency for the corresponding estimators of the parameters of interest. The small sample behavior of the methodology is assessed by means of a simulation study and the above remote sensing example analyzed in detail.