
Group kernels for Gaussian process metamodels with categorical inputs

Added by Olivier Roustant
Publication date: 2018
Language: English





Gaussian processes (GP) are widely used as metamodels for emulating time-consuming computer codes. We focus on problems involving categorical inputs with a potentially large number L of levels (typically several tens), partitioned into G << L groups of various sizes. Parsimonious covariance functions, or kernels, can then be defined by block covariance matrices T with constant covariances between pairs of blocks and within blocks. We study the positive definiteness of such matrices to encourage their practical use. The hierarchical group/level structure, equivalent to a nested Bayesian linear model, provides a parameterization of valid block matrices T. The same model can then be used when the assumption within blocks is relaxed, giving a flexible parametric family of valid covariance matrices with constant covariances between pairs of blocks. The positive definiteness of T is equivalent to the positive definiteness of a smaller matrix of size G, obtained by averaging each block. The model is applied to a problem in nuclear waste analysis, where one of the categorical inputs is atomic number, which has more than 90 levels.
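As a rough numerical illustration of the block structure described above (the parameter values below are arbitrary, not the paper's parameterization), one can assemble a parsimonious block matrix T with constant within-block and between-block covariances, form the smaller G x G matrix by averaging each block, and check positive definiteness of both:

```python
import numpy as np

# Hypothetical example: L = 5 levels split into G = 2 groups of sizes (2, 3).
sizes = [2, 3]
G = len(sizes)
L = sum(sizes)

within = [0.5, 0.5]   # constant off-diagonal covariance inside each group
between = 0.2         # constant covariance between the two groups
variance = 1.0        # common diagonal variance

# Build T: fill with the between-group value, overwrite the diagonal blocks
# with the within-group value, then set the diagonal variances.
T = np.full((L, L), between)
start = 0
for g, n in enumerate(sizes):
    T[start:start + n, start:start + n] = within[g]
    start += n
np.fill_diagonal(T, variance)

# The smaller G x G matrix obtained by averaging each block of T.
Z = np.zeros((L, G))
start = 0
for g, n in enumerate(sizes):
    Z[start:start + n, g] = 1.0 / n   # block-averaging weights
    start += n
T_bar = Z.T @ T @ Z

# For these values both matrices are positive definite, consistent with the
# equivalence stated in the abstract.
print(np.linalg.eigvalsh(T).min() > 0)      # True
print(np.linalg.eigvalsh(T_bar).min() > 0)  # True
```

This is only a sanity check on one instance; the paper's contribution is the characterization of all valid T of this form.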



Related research

142 - JaeHoan Kim, Jaeyong Lee 2021
The Gaussian process regression (GPR) model is a popular nonparametric regression model. In GPR, features of the regression function such as varying degrees of smoothness and periodicities are modeled by combining various covariance kernels, each intended to capture a particular effect. The covariance kernels have unknown parameters, which are estimated by the EM algorithm or Markov chain Monte Carlo. The estimated parameters are key to inferring the features of the regression function, but the identifiability of these parameters has not been investigated. In this paper, we prove identifiability of covariance kernel parameters in GPR with a mixture of two radial basis kernels, and in GPR with mixed radial basis and periodic kernels. We also provide some examples of non-identifiable cases in such mixed-kernel GPRs.
We apply Gaussian process (GP) regression, which provides a powerful non-parametric probabilistic method of relating inputs to outputs, to survival data consisting of time-to-event and covariate measurements. In this context, the covariates are regarded as the 'inputs' and the event times as the 'outputs'. This allows for highly flexible inference of non-linear relationships between covariates and event times. Many existing methods, such as the ubiquitous Cox proportional hazards model, focus primarily on the hazard rate, which is typically assumed to take some parametric or semi-parametric form. Our proposed model belongs to the class of accelerated failure time models, where we focus on directly characterising the relationship between covariates and event times without any explicit assumptions on what form the hazard rates take. It is straightforward to include various types and combinations of censored and truncated observations. We apply our approach to both simulated and experimental data. We then apply multiple-output GP regression, which can handle multiple potentially correlated outputs for each input, to competing-risks survival data where multiple event types can occur. By tuning one of the model parameters we can control the extent to which the multiple outputs (the time-to-event for each risk) are dependent, thus allowing the specification of correlated risks. Simulation studies suggest that in some cases assuming dependence can lead to more accurate predictions.
The analysis of high dimensional survival data is challenging, primarily due to the problem of overfitting which occurs when spurious relationships are inferred from data that subsequently fail to exist in test data. Here we propose a novel method of extracting a low dimensional representation of covariates in survival data by combining the popular Gaussian Process Latent Variable Model (GPLVM) with a Weibull Proportional Hazards Model (WPHM). The combined model offers a flexible non-linear probabilistic method of detecting and extracting any intrinsic low dimensional structure from high dimensional data. By reducing the covariate dimension we aim to diminish the risk of overfitting and increase the robustness and accuracy with which we infer relationships between covariates and survival outcomes. In addition, we can simultaneously combine information from multiple data sources by expressing multiple datasets in terms of the same low dimensional space. We present results from several simulation studies that illustrate a reduction in overfitting and an increase in predictive performance, as well as successful detection of intrinsic dimensionality. We provide evidence that it is advantageous to combine dimensionality reduction with survival outcomes rather than performing unsupervised dimensionality reduction on its own. Finally, we use our model to analyse experimental gene expression data and detect and extract a low dimensional representation that allows us to distinguish high and low risk groups with superior accuracy compared to doing regression on the original high dimensional data.
Our problem is to find a good approximation to the P-value of the maximum of a random field of test statistics for a cone alternative at each point in a sample of Gaussian random fields. These test statistics have been proposed in the neuroscience literature for the analysis of fMRI data allowing for unknown delay in the hemodynamic response. However the null distribution of the maximum of this 3D random field of test statistics, and hence the threshold used to detect brain activation, was unsolved. To find a solution, we approximate the P-value by the expected Euler characteristic (EC) of the excursion set of the test statistic random field. Our main result is the required EC density, derived using the Gaussian Kinematic Formula.
58 - Thomas Browne 2015
This paper deals with the optimization of industrial asset management strategies, whose profitability is characterized by the Net Present Value (NPV) indicator, assessed by a Monte Carlo simulator. The developed method consists in building a metamodel of this stochastic simulator, which makes it possible to obtain, for a given model input, the NPV probability distribution without running the simulator. The present work concentrates on emulating the quantile function of the stochastic simulator by interpolating well-chosen basis functions and metamodeling their coefficients (using the Gaussian process metamodel). This quantile function metamodel is then used to treat a maintenance strategy optimization problem (four systems installed on different plants), in order to optimize an NPV quantile. Using the Gaussian process framework, an adaptive design method (called QFEI) is defined by extending the well-known EGO algorithm. This yields an optimal solution using a small number of simulator runs.
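The QFEI criterion itself is not given in this abstract; for orientation, the classical expected improvement acquisition at the heart of the EGO algorithm it extends can be sketched as follows (minimization convention, standard-normal closed form):

```python
import math

def expected_improvement(mu, sigma, best):
    # Classical expected improvement for minimization: mu and sigma are the
    # GP posterior mean and standard deviation at a candidate point, best is
    # the current best observed value. QFEI extends this idea to quantile
    # optimization; that extension is not reproduced here.
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal pdf
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal cdf
    return (best - mu) * cdf + sigma * pdf

# A point predicted at the current best but with high uncertainty still has
# positive expected improvement, which is what drives exploration in EGO.
print(expected_improvement(mu=1.0, sigma=0.5, best=1.0) > 0)  # True
```

In an adaptive design loop, the next simulator run is placed where this acquisition is maximal, which is how EGO-style methods keep the number of expensive runs small.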