This article investigates the origin of numerical issues in maximum likelihood parameter estimation for Gaussian process (GP) interpolation and investigates simple but effective strategies for improving commonly used open-source software implementations. This work targets a basic problem but a host of studies, particularly in the literature of Bayesian optimization, rely on off-the-shelf GP implementations. For the conclusions of these studies to be reliable and reproducible, robust GP implementations are critical.
Mixture of Experts (MoE) are successful models for modeling heterogeneous data in many statistical learning problems including regression, clustering and classification. Generally fitted by maximum likelihood estimation via the well-known EM algorithm, their application to high-dimensional problems is still therefore challenging. We consider the problem of fitting and feature selection in MoE models, and propose a regularized maximum likelihood estimation approach that encourages sparse solutions for heterogeneous regression data models with potentially high-dimensional predictors. Unlike state-of-the art regularized MLE for MoE, the proposed modelings do not require an approximate of the penalty function. We develop two hybrid EM algorithms: an Expectation-Majorization-Maximization (EM/MM) algorithm, and an EM algorithm with coordinate ascent algorithm. The proposed algorithms allow to automatically obtaining sparse solutions without thresholding, and avoid matrix inversion by allowing univariate parameter updates. An experimental study shows the good performance of the algorithms in terms of recovering the actual sparse solutions, parameter estimation, and clustering of heterogeneous regression data.
We consider the problem of estimating parameters of stochastic differential equations (SDEs) with discrete-time observations that are either completely or partially observed. The transition density between two observations is generally unknown. We propose an importance sampling approach with an auxiliary parameter when the transition density is unknown. We embed the auxiliary importance sampler in a penalized maximum likelihood framework which produces more accurate and computationally efficient parameter estimates. Simulation studies in three different models illustrate promising improvements of the new penalized simulated maximum likelihood method. The new procedure is designed for the challenging case when some state variables are unobserved and moreover, observed states are sparse over time, which commonly arises in ecological studies. We apply this new approach to two epidemics of chronic wasting disease in mule deer.
Gaussian process (GP) predictors are an important component of many Bayesian approaches to machine learning. However, even a straightforward implementation of Gaussian process regression (GPR) requires O(n^2) space and O(n^3) time for a dataset of n examples. Several approximation methods have been proposed, but there is a lack of understanding of the relative merits of the different approximations, and in what situations they are most useful. We recommend assessing the quality of the predictions obtained as a function of the compute time taken, and comparing to standard baselines (e.g., Subset of Data and FITC). We empirically investigate four different approximation algorithms on four different prediction problems, and make our code available to encourage future comparisons.
Gaussian processes are the gold standard for many real-world modeling problems, especially in cases where a models success hinges upon its ability to faithfully represent predictive uncertainty. These problems typically exist as parts of larger frameworks, wherein quantities of interest are ultimately defined by integrating over posterior distributions. These quantities are frequently intractable, motivating the use of Monte Carlo methods. Despite substantial progress in scaling up Gaussian processes to large training sets, methods for accurately generating draws from their posterior distributions still scale cubically in the number of test locations. We identify a decomposition of Gaussian processes that naturally lends itself to scalable sampling by separating out the prior from the data. Building off of this factorization, we propose an easy-to-use and general-purpose approach for fast posterior sampling, which seamlessly pairs with sparse approximations to afford scalability both during training and at test time. In a series of experiments designed to test competing sampling schemes statistical properties and practical ramifications, we demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
Ordinary differential equations (ODEs), commonly used to characterize the dynamic systems, are difficult to propose in closed-form for many complicated scientific applications, even with the help of domain expert. We propose a fast and accurate data-driven method, MAGI-X, to learn the unknown dynamic from the observation data in a non-parametric fashion, without the need of any domain knowledge. Unlike the existing methods that mainly rely on the costly numerical integration, MAGI-X utilizes the powerful functional approximator of neural network to learn the unknown nonlinear dynamic within the MAnifold-constrained Gaussian process Inference (MAGI) framework that completely circumvents the numerical integration. Comparing against the state-of-the-art methods on three realistic examples, MAGI-X achieves competitive accuracy in both fitting and forecasting while only taking a fraction of computational time. Moreover, MAGI-X provides practical solution for the inference of partial observed systems, which no previous method is able to handle.
Subhasish Basak
,Sebastien Petit
,Julien Bect
.
(2021)
.
"Numerical issues in maximum likelihood parameter estimation for Gaussian process interpolation"
.
Emmanuel Vazquez
هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا