We first introduce a novel profile-based alignment algorithm, the multiple continuous Signal Alignment algorithm with Gaussian Process Regression profiles (SA-GPR). SA-GPR addresses the limitations of currently available signal alignment methods by adopting a hybrid of the particle smoothing and Markov-chain Monte Carlo (MCMC) algorithms to align signals, and by applying Gaussian process regression to construct profiles that can be aligned continuously. SA-GPR shares all the strengths of the existing alignment algorithms that depend on profiles but is more exact in the sense that profiles do not need to be discretized as sequential bins. The uncertainty about how performance depends on the resolution of such bins is thereby eliminated. This methodology produces alignments that are consistent, that regularize extreme cases, and that properly reflect the inherent uncertainty. We then extend SA-GPR to a specific problem in the field of paleoceanography with a method called Bayesian Inference Gaussian Process Multiproxy Alignment of Continuous Signals (BIGMACS). The goal of BIGMACS is to infer continuous ages for ocean sediment cores using two classes of age proxies: proxies that explicitly return calendar ages (e.g., radiocarbon) and those used to synchronize ages in multiple marine records (e.g., an oxygen-isotope-based marine proxy known as benthic $\delta^{18}\mathrm{O}$). BIGMACS integrates these two proxies by iteratively performing two steps: profile construction from benthic $\delta^{18}\mathrm{O}$ age models, and alignment of each core to the profile while also reflecting radiocarbon dates. We use BIGMACS to construct a new Deep Northeastern Atlantic stack (i.e., a profile built from a particular set of benthic $\delta^{18}\mathrm{O}$ records) from five ocean sediment cores. We conclude by constructing multiproxy age models for two additional cores from the same region by aligning them to the stack.
Learning in Gaussian Process models occurs through the adaptation of the hyperparameters of the mean and covariance functions. The classical approach entails maximizing the marginal likelihood, yielding fixed point estimates (an approach called \textit{Type II maximum likelihood} or ML-II). An alternative learning procedure is to infer the posterior over hyperparameters in a hierarchical specification of GPs that we call \textit{Fully Bayesian Gaussian Process Regression} (GPR). This work considers two approximation schemes for the intractable hyperparameter posterior: 1) Hamiltonian Monte Carlo (HMC), yielding a sampling-based approximation, and 2) Variational Inference (VI), where the posterior over hyperparameters is approximated by a factorized Gaussian (mean-field) or a full-rank Gaussian accounting for correlations between hyperparameters. We analyze the predictive performance of fully Bayesian GPR on a range of benchmark data sets.
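As an illustration of the ML-II procedure mentioned above, the following is a minimal sketch that evaluates the GP log marginal likelihood for a zero-mean GP with a squared-exponential kernel and picks the lengthscale that maximizes it over a grid. The kernel choice, the grid search, and all function and parameter names (`rbf_kernel`, `log_marginal_likelihood`, `ml2_lengthscale`) are assumptions made for this example and are not taken from the paper, which does not specify an implementation.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale, variance):
    """Squared-exponential kernel on 1-D inputs (assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def log_marginal_likelihood(x, y, lengthscale, variance, noise):
    """log p(y | x, hyperparameters) for a zero-mean GP with Gaussian noise."""
    n = len(x)
    K = rbf_kernel(x, x, lengthscale, variance) + noise * np.eye(n)
    L = np.linalg.cholesky(K)                      # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
    # -0.5 y^T K^{-1} y - 0.5 log|K| - 0.5 n log(2 pi)
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * n * np.log(2.0 * np.pi))

def ml2_lengthscale(x, y, grid, variance=1.0, noise=0.1):
    """ML-II point estimate of the lengthscale by grid search (hypothetical helper)."""
    scores = [log_marginal_likelihood(x, y, ell, variance, noise) for ell in grid]
    return grid[int(np.argmax(scores))]
```

A fully Bayesian treatment would replace the `argmax` with a posterior over the hyperparameters, approximated for instance by HMC samples or a variational Gaussian, and average predictions over it.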
Gaussian processes (GPs) are a well-known nonparametric Bayesian inference technique, but they suffer from scalability problems for large sample sizes, and their performance can degrade for non-stationary or spatially heterogeneous data. In this work, we seek to overcome these issues through (i) employing variational free energy approximations of GPs operating in tandem with online expectation propagation steps; and (ii) introducing a local splitting step which instantiates a new GP whenever the posterior distribution changes significantly as quantified by the Wasserstein metric over posterior distributions. Over time, then, this yields an ensemble of sparse GPs which may be updated incrementally, and adapts to locality, heterogeneity, and non-stationarity in training data.
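The splitting criterion described above can be sketched for the simplest case, in which the old and new posteriors at a point are univariate Gaussians; the 2-Wasserstein distance then has the closed form $\sqrt{(m_1-m_2)^2+(s_1-s_2)^2}$. The function names and the threshold value below are hypothetical, and the paper may compare posteriors in a different way.

```python
import math

def w2_gaussian_1d(mean1, std1, mean2, std2):
    """2-Wasserstein distance between N(mean1, std1^2) and N(mean2, std2^2)."""
    return math.sqrt((mean1 - mean2) ** 2 + (std1 - std2) ** 2)

def should_split(old_posterior, new_posterior, threshold=1.0):
    """Return True when the posterior moved by more than `threshold`.

    Each posterior is a (mean, std) pair; `threshold` is an assumed
    tuning parameter, not a value from the paper.
    """
    (m1, s1), (m2, s2) = old_posterior, new_posterior
    return w2_gaussian_1d(m1, s1, m2, s2) > threshold
```

When `should_split` returns True, an ensemble method of this kind would instantiate a new sparse GP instead of updating the current one.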
This paper presents a Gaussian process (GP) model for estimating piecewise continuous regression functions. In many scientific and engineering applications of regression analysis, the underlying regression functions are piecewise continuous: the data follow different continuous regression models in different regions of the input space, with possible discontinuities between the regions. However, many conventional GP regression approaches are not designed for piecewise regression analysis. We propose a new GP modeling approach for estimating an unknown piecewise continuous regression function. The new GP model seeks a local GP estimate of the unknown regression function at each test location, using local data neighboring the test location. To accommodate the possibility that the local data come from different regions, the local data are partitioned into two sides by a local linear boundary, and only the local data on the same side as the test location are used for the regression estimate. This local split works well when the input regions are bounded by smooth boundaries, because a smooth boundary is well approximated locally by a linear one. We estimate the local linear boundary jointly with the other hyperparameters of the GP model by maximum likelihood. The computation time is comparable to that of local GP methods. The superior numerical performance of the proposed approach over conventional GP modeling approaches is shown using various simulated piecewise regression functions.
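The local split described above can be sketched as follows: select the nearest neighbors of a test location, then keep only those on the same side of a linear boundary $w^\top x + b = 0$ as the test location. In this sketch the boundary parameters `w` and `b` are simply given as inputs, whereas the paper estimates them jointly with the GP hyperparameters by maximum likelihood; the function name and arguments are hypothetical.

```python
import numpy as np

def same_side_local_data(X, y, x_test, k, w, b):
    """Return the k nearest neighbors of x_test that lie on its side of the boundary.

    X: (n, d) inputs, y: (n,) responses, x_test: (d,) test location,
    w: (d,) boundary normal, b: scalar offset (both assumed known here).
    """
    # k nearest neighbors of the test location by Euclidean distance
    dist = np.linalg.norm(X - x_test, axis=1)
    idx = np.argsort(dist)[:k]
    X_loc, y_loc = X[idx], y[idx]

    # Keep only the neighbors on the same side of the boundary as x_test
    side_test = np.sign(x_test @ w + b)
    mask = np.sign(X_loc @ w + b) == side_test
    return X_loc[mask], y_loc[mask]
```

The retained points would then be passed to an ordinary GP regression to produce the local estimate at `x_test`.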
We present a model that can automatically learn alignments between high-dimensional data in an unsupervised manner. Our proposed method casts alignment learning in a framework where both alignment and data are modelled simultaneously. Further, we automatically infer groupings of different types of sequences within the same dataset. We derive a probabilistic model built on non-parametric priors that allows for flexible warps while at the same time providing means to specify interpretable constraints. We demonstrate the efficacy of our approach with superior quantitative performance to the state-of-the-art approaches and provide examples to illustrate the versatility of our model in automatic inference of sequence groupings, absent from previous approaches, as well as easy specification of high level priors for different modalities of data.
1. Sharp prediction of extinction times is needed in biodiversity monitoring and conservation management. 2. The Galton-Watson process is a classical stochastic model for describing population dynamics. Its evolution resembles that of the matrix population model, except that offspring numbers are random. The extinction probability, extinction time, and abundance are well known and given by explicit formulas. In contrast with the deterministic model, it can be applied to small populations. 3. The parameters of this model can be estimated within the Bayesian inference framework, which makes it possible to consider non-arbitrary scenarios. 4. We show how coupling Bayesian inference with the Galton-Watson model provides several features: i) a flexible modelling approach with easily understandable parameters; ii) compatibility with the classical matrix population model (Leslie-type model); iii) a non-computational approach, which yields more information with less computing; iv) a non-arbitrary choice of scenarios and parameters. The approach can be seen as going one step further than the classical matrix population model for the viability problem. 5. To illustrate these features, we provide analysis details for two examples, one of which is a real-life example.
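As an illustration of the explicit formulas mentioned above, the sketch below treats a single-type Galton-Watson process started from one individual. The probability of extinction by generation n is obtained by iterating the offspring probability generating function f from 0, and the limit is the smallest fixed point of f in [0, 1]. The single-type setting, the function names, and the example offspring distribution are assumptions made for this example, not details from the paper.

```python
def pgf(offspring_probs, s):
    """Offspring probability generating function f(s) = sum_k p_k s^k."""
    return sum(p * s ** k for k, p in enumerate(offspring_probs))

def extinction_by_generation(offspring_probs, n_generations):
    """q[n] = P(population extinct by generation n), starting from one individual."""
    q = [0.0]  # generation 0: the single founder is alive
    for _ in range(n_generations):
        q.append(pgf(offspring_probs, q[-1]))  # q_{n+1} = f(q_n)
    return q

def extinction_probability(offspring_probs, n_iter=500):
    """Approximate the ultimate extinction probability by fixed-point iteration."""
    return extinction_by_generation(offspring_probs, n_iter)[-1]

# Hypothetical offspring distribution: P(0)=0.25, P(1)=0.25, P(2)=0.5 (mean 1.25)
example_probs = [0.25, 0.25, 0.5]
```

In a Bayesian analysis, one would draw offspring distributions from their posterior and apply `extinction_probability` to each draw, giving a posterior distribution over the extinction probability.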