Do you want to publish a course? Click here

Gaussian Process Subspace Regression for Model Reduction

88   0   0.0 ( 0 )
 Added by Ruda Zhang
 Publication date 2021
and research's language is English




Ask ChatGPT about the research

Subspace-valued functions arise in a wide range of problems, including parametric reduced order modeling (PROM). In PROM, each parameter point can be associated with a subspace, which is used for Petrov-Galerkin projections of large system matrices. Previous efforts to approximate such functions use interpolations on manifolds, which can be inaccurate and slow. To tackle this, we propose a novel Bayesian nonparametric model for subspace prediction: the Gaussian Process Subspace regression (GPS) model. This method is extrinsic and intrinsic at the same time: with multivariate Gaussian distributions on the Euclidean space, it induces a joint probability model on the Grassmann manifold, the set of fixed-dimensional subspaces. The GPS adopts a simple yet general correlation structure, and a principled approach for model selection. Its predictive distribution admits an analytical form, which allows for efficient subspace prediction over the parameter space. For PROM, the GPS provides a probabilistic prediction at a new parameter point that retains the accuracy of local reduced models, at a computational complexity that does not depend on system dimension, and thus is suitable for online computation. We give four numerical examples to compare our method to subspace interpolation, as well as two methods that interpolate local reduced models. Overall, GPS is the most data efficient, more computationally efficient than subspace interpolation, and gives smooth predictions with uncertainty quantification.



rate research

Read More

In this paper we introduce a novel model for Gaussian process (GP) regression in the fully Bayesian setting. Motivated by the ideas of sparsification, localization and Bayesian additive modeling, our model is built around a recursive partitioning (RP) scheme. Within each RP partition, a sparse GP (SGP) regression model is fitted. A Bayesian additive framework then combines multiple layers of partitioned SGPs, capturing both global trends and local refinements with efficient computations. The model addresses both the problem of efficiency in fitting a full Gaussian process regression model and the problem of prediction performance associated with a single SGP. Our approach mitigates the issue of pseudo-input selection and avoids the need for complex inter-block correlations in existing methods. The crucial trade-off becomes choosing between many simpler local model components or fewer complex global model components, which the practitioner can sensibly tune. Implementation is via a Metropolis-Hasting Markov chain Monte-Carlo algorithm with Bayesian back-fitting. We compare our model against popular alternatives on simulated and real datasets, and find the performance is competitive, while the fully Bayesian procedure enables the quantification of model uncertainties.
142 - JaeHoan Kim , Jaeyong Lee 2021
Gaussian process regression (GPR) model is a popular nonparametric regression model. In GPR, features of the regression function such as varying degrees of smoothness and periodicities are modeled through combining various covarinace kernels, which are supposed to model certain effects. The covariance kernels have unknown parameters which are estimated by the EM-algorithm or Markov Chain Monte Carlo. The estimated parameters are keys to the inference of the features of the regression functions, but identifiability of these parameters has not been investigated. In this paper, we prove identifiability of covariance kernel parameters in two radial basis mixed kernel GPR and radial basis and periodic mixed kernel GPR. We also provide some examples about non-identifiable cases in such mixed kernel GPRs.
The analysis of high dimensional survival data is challenging, primarily due to the problem of overfitting which occurs when spurious relationships are inferred from data that subsequently fail to exist in test data. Here we propose a novel method of extracting a low dimensional representation of covariates in survival data by combining the popular Gaussian Process Latent Variable Model (GPLVM) with a Weibull Proportional Hazards Model (WPHM). The combined model offers a flexible non-linear probabilistic method of detecting and extracting any intrinsic low dimensional structure from high dimensional data. By reducing the covariate dimension we aim to diminish the risk of overfitting and increase the robustness and accuracy with which we infer relationships between covariates and survival outcomes. In addition, we can simultaneously combine information from multiple data sources by expressing multiple datasets in terms of the same low dimensional space. We present results from several simulation studies that illustrate a reduction in overfitting and an increase in predictive performance, as well as successful detection of intrinsic dimensionality. We provide evidence that it is advantageous to combine dimensionality reduction with survival outcomes rather than performing unsupervised dimensionality reduction on its own. Finally, we use our model to analyse experimental gene expression data and detect and extract a low dimensional representation that allows us to distinguish high and low risk groups with superior accuracy compared to doing regression on the original high dimensional data.
We apply Gaussian process (GP) regression, which provides a powerful non-parametric probabilistic method of relating inputs to outputs, to survival data consisting of time-to-event and covariate measurements. In this context, the covariates are regarded as the `inputs and the event times are the `outputs. This allows for highly flexible inference of non-linear relationships between covariates and event times. Many existing methods, such as the ubiquitous Cox proportional hazards model, focus primarily on the hazard rate which is typically assumed to take some parametric or semi-parametric form. Our proposed model belongs to the class of accelerated failure time models where we focus on directly characterising the relationship between covariates and event times without any explicit assumptions on what form the hazard rates take. It is straightforward to include various types and combinations of censored and truncated observations. We apply our approach to both simulated and experimental data. We then apply multiple output GP regression, which can handle multiple potentially correlated outputs for each input, to competing risks survival data where multiple event types can occur. By tuning one of the model parameters we can control the extent to which the multiple outputs (the time-to-event for each risk) are dependent thus allowing the specification of correlated risks. Simulation studies suggest that in some cases assuming dependence can lead to more accurate predictions.
161 - Wenjia Wang , Bing-Yi Jing 2021
In this work, we investigate Gaussian process regression used to recover a function based on noisy observations. We derive upper and lower error bounds for Gaussian process regression with possibly misspecified correlation functions. The optimal convergence rate can be attained even if the smoothness of the imposed correlation function exceeds that of the true correlation function and the sampling scheme is quasi-uniform. As byproducts, we also obtain convergence rates of kernel ridge regression with misspecified kernel function, where the underlying truth is a deterministic function. The convergence rates of Gaussian process regression and kernel ridge regression are closely connected, which is aligned with the relationship between sample paths of Gaussian process and the corresponding reproducing kernel Hilbert space.
comments
Fetching comments Fetching comments
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا