
Large-scale local surrogate modeling of stochastic simulation experiments

Added by Austin Cole
Publication date: 2021
Language: English





Gaussian process (GP) regression in large-data contexts, which often arises in surrogate modeling of stochastic simulation experiments, is challenged by cubic runtimes. Coping with input-dependent noise in that setting is doubly so. Recent advances target reduced computational complexity through local approximation (e.g., LAGP) or otherwise induced sparsity. Yet these do not economically accommodate a common design feature when attempting to separate signal from noise. Replication can offer both statistical and computational efficiencies, motivating several extensions to the local surrogate modeling toolkit. Introducing a nugget into a local kernel structure is just the first step. We argue that a new inducing point formulation (LIGP), already preferred over LAGP on the speed-vs-accuracy frontier, conveys additional advantages when replicates are involved. Woodbury identities allow local kernel structure to be expressed in terms of unique design locations only, increasing the amount of data (i.e., the neighborhood size) that may be leveraged without additional flops. We demonstrate that this upgraded LIGP provides more accurate prediction and uncertainty quantification compared to several modern alternatives. Illustrations are provided on benchmark data, real-world simulation experiments on epidemic management and ocean oxygen concentration, and in an options pricing control framework.
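The computational payoff from replication described above can be sketched in a few lines. The sketch below assumes a squared-exponential kernel with fixed, illustrative hyperparameters (`lengthscale`, `nugget`); it collapses the full N-point kernel system to the n unique design locations, with the nugget scaled by the replicate counts, as the Woodbury identities in the abstract allow. It is not the authors' LIGP implementation, only a minimal demonstration of the idea.

```python
import numpy as np

def gp_predict_replicates(X, y, Xstar, lengthscale=0.5, nugget=0.1):
    """GP predictive mean using only unique design locations.

    With a_i replicates at each unique x_i, the full N x N kernel
    system collapses (via Woodbury-type identities) to an n x n
    system over unique locations, with the nugget scaled by 1/a_i.
    """
    # Collapse replicated rows to unique locations, mean responses, counts.
    Xu, inv, counts = np.unique(X, axis=0, return_inverse=True,
                                return_counts=True)
    ybar = np.zeros(len(Xu))
    np.add.at(ybar, inv, y)
    ybar /= counts

    def kern(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / lengthscale**2)

    Kn = kern(Xu, Xu) + np.diag(nugget / counts)   # n x n, not N x N
    ks = kern(np.atleast_2d(Xstar), Xu)
    return ks @ np.linalg.solve(Kn, ybar)
```

The key line is the diagonal `nugget / counts`: heavily replicated sites get a smaller effective noise variance, so averaging over replicates costs no extra flops beyond the n x n solve.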

Related research

This paper presents a new Gaussian process (GP) surrogate model for predicting the outcome of a physical experiment in which some experimental inputs are controlled by other manipulating factors. In particular, we are interested in the case where the control precision is not very high, so the input factor values vary significantly even under the same setting of the corresponding manipulating factors. This case arises in our main application to carbon nanotube growth experiments, where one experimental input among many is manipulated by another factor, and the relation between the input and the manipulating factor varies significantly with the dates and times of operation. Due to this variation, a standard GP surrogate that directly relates the manipulating factors to the experimental outcome does not provide great predictive power for the outcome. At the same time, a GP model relating the main factors directly to the outcome is not appropriate for prediction because the main factors cannot be accurately set as planned for a future experiment. Motivated by the carbon nanotube example, we propose a two-tiered GP model, where the bottom tier relates the manipulating factors to the corresponding main factors with potential biases and variation independent of the manipulating factors, and the top tier relates the main factors to the experimental outcome. Our two-tiered model explicitly models the propagation of the control uncertainty to the experimental outcome through the two GP modeling tiers. We present the inference and hyperparameter estimation of the proposed model. The proposed approach is illustrated with the motivating example of a closed-loop autonomous research system for carbon nanotube growth experiments, and the test results are reported with a comparison to a benchmark method, i.e., a standard GP model.
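The two-tier propagation idea can be illustrated with a small sketch: draw the main factor from the bottom tier's posterior given the manipulating factor, then push those draws through the top tier. Everything here (the `SimpleGP` class, the fixed hyperparameters, the Monte Carlo size) is an illustrative assumption, not the paper's inference procedure.

```python
import numpy as np

def sq_exp(A, B, ell):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / ell**2)

class SimpleGP:
    """Minimal zero-mean GP with fixed hyperparameters (illustrative only)."""
    def __init__(self, X, y, ell=0.5, nugget=1e-3):
        self.X, self.ell = X, ell
        self.K = sq_exp(X, X, ell) + nugget * np.eye(len(X))
        self.alpha = np.linalg.solve(self.K, y)

    def predict(self, Xs):
        ks = sq_exp(Xs, self.X, self.ell)
        mean = ks @ self.alpha
        var = 1.0 - np.einsum('ij,ij->i', ks,
                              np.linalg.solve(self.K, ks.T).T)
        return mean, np.maximum(var, 1e-12)

def two_tier_predict(m_new, bottom, top, n_draws=200, rng=None):
    """Propagate control uncertainty: sample the main factor x | m from
    the bottom tier, then average the top tier's mean over those draws."""
    if rng is None:
        rng = np.random.default_rng(0)
    mu, var = bottom.predict(np.atleast_2d(m_new))
    x_draws = mu + np.sqrt(var) * rng.standard_normal(n_draws)
    y_mean, _ = top.predict(x_draws[:, None])
    return y_mean.mean()
```

The Monte Carlo average over bottom-tier draws is what carries the control uncertainty into the outcome prediction, rather than plugging in a single point estimate of the main factor.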
Inference on unknown quantities in dynamical systems via observational data is essential for providing meaningful insight, furnishing accurate predictions, enabling robust control, and establishing appropriate designs for future experiments. Merging mathematical theory with empirical measurements in a statistically coherent way is critical, and challenges abound: ill-posedness of the parameter estimation problem, proper regularization and incorporation of prior knowledge, and computational limitations on full uncertainty quantification. To address these issues, we propose a new method for learning parameterized dynamical systems from data. In many ways, our proposal turns the canonical framework on its head. We first fit a surrogate stochastic process to observational data, enforcing prior knowledge (e.g., smoothness) and coping with challenging data features like heteroskedasticity, heavy tails, and censoring. Then, samples of the stochastic process are used as surrogate data, and point estimates are computed via ordinary point estimation methods in a modular fashion. An attractive feature of this approach is that it is fully Bayesian and simultaneously parallelizable. We demonstrate the advantages of our new approach on a predator-prey simulation study and on a real-world application involving within-host influenza virus infection data paired with a viral kinetic model.
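The surrogate-then-estimate pattern above can be sketched on a toy exponential-decay model y = y0*exp(-theta*t). The sketch uses a polynomial smoother as a stand-in for the paper's surrogate stochastic process, and a closed-form log-slope estimator as the "ordinary point estimation method"; both are simplifying assumptions chosen to keep the example self-contained.

```python
import numpy as np

def estimate_decay_rates(t, y, n_samples=500, deg=4, rng=None):
    """Two-stage estimation sketch: (1) fit a smooth surrogate to a
    noisy trajectory; (2) draw surrogate trajectories and apply an
    ordinary point estimator to each draw.  Each draw is independent,
    so stage 2 is trivially parallelizable."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Stage 1: polynomial surrogate of the trajectory plus a noise level.
    coef = np.polyfit(t, y, deg)
    smooth = np.polyval(coef, t)
    resid_sd = np.std(y - smooth)
    # Stage 2: perturb the surrogate and re-estimate theta on each draw,
    # via the slope of log y for the model y = y0 * exp(-theta * t).
    thetas = []
    for _ in range(n_samples):
        ys = smooth + resid_sd * rng.standard_normal(len(t))
        ys = np.maximum(ys, 1e-8)            # keep the log well-defined
        slope, _ = np.polyfit(t, np.log(ys), 1)
        thetas.append(-slope)
    return np.array(thetas)
```

The returned sample of theta values plays the role of the distribution of point estimates the abstract describes: the surrogate, not the raw data, is what gets resampled.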
Lu Zou, Xiaowei Zhang (2018)
Stochastic kriging is a popular metamodeling technique for representing the unknown response surface of a simulation model. However, the simulation model may be inadequate in the sense that there may be a non-negligible discrepancy between it and the real system of interest. Failing to account for the model discrepancy may conceivably result in erroneous prediction of the real system's performance and mislead the decision-making process. This paper proposes a metamodel that extends stochastic kriging to incorporate the model discrepancy. Both the simulation outputs and the real data are used to characterize the model discrepancy. The proposed metamodel can provably enhance the prediction of the real system's performance. We derive general results for experiment design and analysis, and demonstrate the advantage of the proposed metamodel relative to competing methods. Finally, we study the effect of Common Random Numbers (CRN). The use of CRN is well known to be detrimental to the prediction accuracy of stochastic kriging in general. By contrast, we show that the effect of CRN in the new context is substantially more complex. The use of CRN can be either detrimental or beneficial depending on the interplay between the magnitude of the observation errors and other parameters involved.
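A minimal sketch of the simulator-plus-discrepancy idea: model the real response as the simulator output plus a discrepancy term, fitting a separate GP to each. This modular two-GP version with fixed hyperparameters is a simplifying assumption for illustration, not the paper's joint stochastic-kriging metamodel.

```python
import numpy as np

def kern(A, B, ell):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / ell**2)

def gp_mean(X, y, Xs, ell=0.5, nugget=1e-2):
    """Zero-mean GP predictive mean with fixed hyperparameters."""
    K = kern(X, X, ell) + nugget * np.eye(len(X))
    return kern(Xs, X, ell) @ np.linalg.solve(K, y)

def predict_real(X_sim, y_sim, X_real, y_real, Xs, ell=0.5):
    """Modular discrepancy sketch: real(x) ~ simulator(x) + delta(x),
    with the simulator GP and the discrepancy GP fitted separately."""
    sim_at_real = gp_mean(X_sim, y_sim, X_real, ell)
    delta = y_real - sim_at_real          # observed discrepancy at real data
    return gp_mean(X_sim, y_sim, Xs, ell) + gp_mean(X_real, delta, Xs, ell)
```

Because simulation runs are usually cheap relative to real observations, the simulator GP can be trained on a dense design while the discrepancy GP makes do with the few real data points, which is the asymmetry the abstract exploits.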
When we use simulation to evaluate the performance of a stochastic system, the simulation often contains input distributions estimated from real-world data; therefore, there is both simulation and input uncertainty in the performance estimates. Ignoring either source of uncertainty underestimates the overall statistical error. Simulation uncertainty can be reduced by additional computation (e.g., more replications). Input uncertainty can be reduced by collecting more real-world data, when feasible. This paper proposes an approach to quantify overall statistical uncertainty when the simulation is driven by independent parametric input distributions; specifically, we produce a confidence interval that accounts for both simulation and input uncertainty by using a metamodel-assisted bootstrapping approach. The input uncertainty is measured via bootstrapping, an equation-based stochastic kriging metamodel propagates the input uncertainty to the output mean, and both simulation and metamodel uncertainty are derived using properties of the metamodel. A variance decomposition is proposed to estimate the relative contribution of input to overall uncertainty; this information indicates whether the overall uncertainty can be significantly reduced through additional simulation alone. Asymptotic analysis provides theoretical support for our approach, while an empirical study demonstrates that it has good finite-sample performance.
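The bootstrap component of the approach above can be sketched as follows. Instead of the paper's metamodel-assisted propagation, this simplified version re-runs cheap simulation replications at each bootstrap input parameter and folds the resulting variance (input plus simulation) into a single normal-theory interval; the `simulate` callback, the exponential input model in the test, and the variance pooling are all illustrative assumptions.

```python
import numpy as np

def mean_ci_with_input_uncertainty(real_data, simulate, n_boot=200,
                                   n_reps=50, level=1.96, rng=None):
    """Bootstrap sketch: resample the real-world data to get bootstrap
    input parameters, run simulation replications at each one, and fold
    input and simulation variance into one confidence interval."""
    if rng is None:
        rng = np.random.default_rng(0)
    means = []
    for _ in range(n_boot):
        boot = rng.choice(real_data, size=len(real_data), replace=True)
        rate = 1.0 / boot.mean()            # parametric input estimate
        reps = [simulate(rate, rng) for _ in range(n_reps)]
        means.append(np.mean(reps))
    means = np.asarray(means)
    center, spread = means.mean(), means.std()
    return center - level * spread, center + level * spread
```

The spread of the bootstrap means mixes both error sources, which is why ignoring either one, as the abstract warns, understates the interval width.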
This paper presents a network hardware-in-the-loop (HIL) simulation system for modeling large-scale power systems. Researchers have developed many HIL test systems for power systems in recent years. Those test systems can model both microsecond-level dynamic responses of power electronic systems and millisecond-level transients of transmission and distribution grids. By integrating individual HIL test systems into a network of HIL test systems, we can create large-scale power grid digital twins with flexible structures at the required modeling resolution, suited to a wide range of system operating conditions. This will not only significantly reduce the need for field tests when developing new technologies but also greatly shorten the model development cycle. In this paper, we present a networked OPAL-RT based HIL test system for developing transmission-distribution coordinative Volt-VAR regulation technologies as an example to illustrate system setups, communication requirements among different HIL simulation systems, and system connection mechanisms. Impacts of communication delays, information exchange cycles, and computing delays are illustrated. Simulation results show that the performance of a networked HIL test system is satisfactory.
