High-dimensional simulation optimization is notoriously challenging. We propose a new sampling algorithm that converges to a global optimal solution and suffers minimally from the curse of dimensionality. The algorithm consists of two stages. First, we take samples following a sparse grid experimental design and approximate the response surface via kernel ridge regression with a Brownian field kernel. Second, we follow the expected improvement strategy -- with critical modifications that boost the algorithm's sample efficiency -- to iteratively sample from the next level of the sparse grid. Under mild conditions on the smoothness of the response surface and the simulation noise, we establish upper bounds on the convergence rate for both noise-free and noisy simulation samples. These upper bounds deteriorate only slightly in the dimension of the feasible set, and they can be improved if the objective function is known to be of a higher-order smoothness. Extensive numerical experiments demonstrate that the proposed algorithm dramatically outperforms typical alternatives in practice.
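The first stage described above (kernel ridge regression with a Brownian field kernel) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the kernel form `prod_j (1 + min(x_j, y_j))` on `[0, 1]^d`, the function names, and the regularization parameter `lam` are all assumptions made for the example, and the sparse grid design and the expected improvement stage are not shown.

```python
import numpy as np


def brownian_field_kernel(X, Y):
    """Assumed product kernel prod_j (1 + min(x_j, y_j)) for inputs in [0, 1]^d."""
    # Pairwise coordinate-wise minima, shape (n, m, d).
    mins = np.minimum(X[:, None, :], Y[None, :, :])
    return np.prod(1.0 + mins, axis=2)


def fit_krr(X, y, lam=1e-3):
    """Return coefficients alpha solving (K + n * lam * I) alpha = y."""
    n = X.shape[0]
    K = brownian_field_kernel(X, X)
    return np.linalg.solve(K + n * lam * np.eye(n), y)


def predict_krr(X_train, alpha, X_new):
    """Evaluate the fitted surface sum_i alpha_i K(x, x_i) at the new points."""
    return brownian_field_kernel(X_new, X_train) @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_train = rng.uniform(size=(30, 2))      # hypothetical design points
    y_train = np.sin(X_train.sum(axis=1))    # hypothetical noise-free response
    alpha = fit_krr(X_train, y_train, lam=1e-6)
    print(predict_krr(X_train, alpha, np.array([[0.5, 0.5]])))
```

With noisy simulation output, a larger `lam` would presumably be chosen; the value used here is only a placeholder.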
This paper develops a frequentist solution to the functional calibration problem, where the value of a calibration parameter in a computer model is allowed to vary with the value of control variables in the physical system. The need for functional calibration is motivated by engineering applications where using a constant calibration parameter results in a significant mismatch between outputs from the computer model and the physical experiment. Reproducing kernel Hilbert spaces (RKHS) are used to model the optimal calibration function, defined as the functional relationship between the calibration parameter and control variables that gives the best prediction. This optimal calibration function is estimated from physical data through penalized least squares with an RKHS-norm penalty. An uncertainty quantification procedure is also developed for such estimates. Theoretical guarantees of the proposed method are provided in terms of prediction consistency and consistency of estimating the optimal calibration function. The proposed method is tested using both real and synthetic data and exhibits more robust performance in prediction and uncertainty quantification than the existing parametric functional calibration method and a state-of-the-art Bayesian method.
This paper is concerned with a nonparametric regression problem in which the independence assumption of the input variables and the residuals is no longer valid. With existing model selection methods, such as cross validation, the presence of temporal autocorrelation in the input variables and the error terms leads to model overfitting. This phenomenon is referred to as temporal overfitting, which causes loss of performance when predicting responses for a time domain different from the training time domain. We propose a new method to tackle the temporal overfitting problem. Our nonparametric model is partitioned into two parts -- a time-invariant component and a time-varying component, each of which is modeled through a Gaussian process regression. The key to our inference is a thinning-based strategy, an idea borrowed from Markov chain Monte Carlo sampling, to estimate the two components, respectively. Our specific application in this paper targets power curve modeling in wind energy. In our numerical studies, we extensively compare our proposed method with both existing power curve models and available ideas for handling temporal overfitting. Our approach yields significant improvement in prediction both in and outside the time domain covered by the training data.
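The thinning idea mentioned above, as it is commonly used in Markov chain Monte Carlo, keeps every T-th point of a time-ordered sequence so that the retained points are less autocorrelated. A minimal sketch of splitting time-ordered data into interleaved thinned subsets is given below. The function name and the choice of interleaved groups are assumptions for illustration; the abstract does not specify how the paper applies thinning to the two model components.

```python
def thinning_partition(n_samples, n_groups):
    """Split time-ordered indices 0..n_samples-1 into interleaved subsets.

    Subset k holds indices k, k + n_groups, k + 2 * n_groups, ..., so
    neighbouring (most strongly autocorrelated) points land in different subsets.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    return [list(range(start, n_samples, n_groups)) for start in range(n_groups)]


if __name__ == "__main__":
    for subset in thinning_partition(10, 3):
        print(subset)
```

A larger `n_groups` widens the time gap between points within a subset, at the cost of fewer points per subset.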
We propose a novel GAN framework for non-parametric density estimation with high-dimensional data. This framework is based on a novel density estimator, called the hyperbolic cross density estimator, which enjoys nice convergence properties in the mixed Sobolev spaces. As modifications of the usual Sobolev spaces, the mixed Sobolev spaces are more suitable for describing high-dimensional density functions. We prove that, unlike other existing approaches, the proposed GAN framework does not suffer from the curse of dimensionality and can achieve the optimal convergence rate of $O_p(n^{-1/2})$, with $n$ data points in an arbitrary fixed dimension. We also study the universality of GANs in terms of the existence of ReLU networks which can approximate the density functions in the mixed Sobolev spaces up to any accuracy level.
Gecheng Chen, Rui Tuo (2020)
A primary goal of computer experiments is to reconstruct the function given by the computer code via scattered evaluations. Traditional isotropic Gaussian process models suffer from the curse of dimensionality when the input dimension is high. Gaussian process models with additive correlation functions are scalable to dimensionality, but they are very restrictive as they only work for additive functions. In this work, we consider a projection pursuit model, in which the nonparametric part is driven by an additive Gaussian process regression. The dimension of the additive function is chosen to be higher than the original input dimension. We show that this dimension expansion can help approximate more complex functions. A gradient descent algorithm is proposed to maximize the likelihood function. Simulation studies show that the proposed method outperforms the traditional Gaussian process models.
This work proposes a nonparametric method to compare the underlying mean functions given two noisy datasets. The motivation for the work stems from an application of comparing wind turbine power curves. Comparing wind turbine data presents new problems, namely the need to identify the regions of difference in the input space and to quantify the extent of difference that is statistically significant. Our proposed method, referred to as funGP, estimates the underlying functions for different data samples using Gaussian process models. We build a confidence band using the probability law of the estimated function differences under the null hypothesis. Then, the confidence band is used for the hypothesis test as well as for identifying the regions of difference. This identification of difference regions is a distinct feature, as existing methods tend to conduct an overall hypothesis test stating whether two functions are different. Understanding the difference regions can lead to further practical insights and help devise better control and maintenance strategies for wind turbines. The merit of funGP is demonstrated by using three simulation studies and four real wind turbine datasets.
Despite their success, kernel methods suffer from a massive computational cost in practice. In this paper, in lieu of the commonly used kernel expansion with respect to $N$ inputs, we develop a novel optimal design maximizing the entropy among kernel features. This procedure results in a kernel expansion with respect to entropic optimal features (EOF), improving the data representation dramatically due to feature dissimilarity. Under mild technical assumptions, our generalization bound shows that with only $O(N^{\frac{1}{4}})$ features (disregarding logarithmic factors), we can achieve the optimal statistical accuracy (i.e., $O(1/\sqrt{N})$). The salient feature of our design is its sparsity, which significantly reduces the time and space cost. Our numerical experiments on benchmark datasets verify the superiority of EOF over the state-of-the-art in kernel approximation.
Rui Tuo, Wenjia Wang (2020)
Bayesian optimization is a class of global optimization techniques. It regards the underlying objective function as a realization of a Gaussian process. Although the outputs of Bayesian optimization are random according to the Gaussian process assumption, quantification of this uncertainty is rarely studied in the literature. In this work, we propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms, in terms of constructing confidence regions of the maximum point or value of the objective function. These regions can be computed efficiently, and their confidence levels are guaranteed by newly developed uniform error bounds for sequential Gaussian process regression. Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria.
Rui Tuo, Yan Wang, C. F. Jeff Wu (2020)
Kernel ridge regression is an important nonparametric method for estimating smooth functions. We introduce a new set of conditions, under which the actual rates of convergence of the kernel ridge regression estimator under both the $L_2$ norm and the norm of the reproducing kernel Hilbert space exceed the standard minimax rates. An application of this theory leads to a new understanding of the Kennedy-O'Hagan approach for calibrating model parameters of computer simulation. We prove that, under certain conditions, the Kennedy-O'Hagan calibration estimator with a known covariance function converges to the minimizer of the norm of the residual function in the reproducing kernel Hilbert space.
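For reference, the kernel ridge regression estimator discussed above is usually written in the following standard textbook form. The notation is an assumption introduced here, not taken from the paper: data $(x_i, y_i)$, $i = 1, \dots, n$, a reproducing kernel $K$ with Hilbert space $\mathcal{H}$, and a regularization parameter $\lambda > 0$; the paper's own scaling of the penalty may differ.

$$\hat{f}_\lambda = \operatorname*{argmin}_{f \in \mathcal{H}} \; \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2 + \lambda \|f\|_{\mathcal{H}}^2, \qquad \hat{f}_\lambda(x) = k(x)^\top (\mathbf{K} + n\lambda I)^{-1} y,$$

where $\mathbf{K}_{ij} = K(x_i, x_j)$, $k(x) = (K(x, x_1), \dots, K(x, x_n))^\top$, and $y = (y_1, \dots, y_n)^\top$. The closed form on the right follows from the representer theorem.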
Yan Wang, Xiaowei Yue, Rui Tuo (2019)
Estimation of model parameters of computer simulators, also known as calibration, is an important topic in many engineering applications. In this paper, we consider the calibration of computer model parameters with the help of engineering design knowledge. We introduce the concept of sensible (calibration) variables. Sensible variables are model parameters which are sensitive in the engineering modeling, and whose optimal values differ from the engineering design values. We propose an effective calibration method to identify and adjust the sensible variables with limited physical experimental data. The methodology is applied to a composite fuselage simulation problem.