A systematic approach based on the principles of supervised learning and design-of-experiments concepts is introduced to build a surrogate model for estimating the optical properties of fractal aggregates. The surrogate model is built on Gaussian process (GP) regression, and the input points for the GP regression are sampled with an adaptive sequential design algorithm. The covariance functions used are the squared exponential and the Matérn covariance functions, both with automatic relevance determination (ARD). The optical property considered is the extinction efficiency of soot aggregates. The strengths and weaknesses of the proposed methodology are first tested with Rayleigh-Debye-Gans theory for fractal aggregates (RDG-FA). Then, surrogate models are developed for the sampled points, for which the extinction efficiency is calculated by the discrete dipole approximation (DDA). Four uniformly gridded databases are also constructed for comparison. The estimations based on the surrogate model designed with the Matérn covariance function are superior to the database-based estimations, both in accuracy and in the total number of input points required. Finally, a preliminary surrogate model for the scattering matrix element S11 is built to correct RDG-FA predictions, with the aim of combining the speed of RDG-FA with the accuracy of DDA.
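As a rough illustration of the pipeline this abstract describes, the sketch below fits a GP with a Matérn ARD kernel to a toy stand-in for extinction efficiency, then takes one adaptive sequential-design step by sampling where the predictive uncertainty is largest. The input descriptors, their ranges, and the toy response are assumptions for illustration, not the paper's setup.

```python
# Minimal sketch of the GP-surrogate idea, using scikit-learn.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

# Hypothetical inputs: number of primary particles N and size parameter x_p.
lo, hi = [10.0, 0.05], [500.0, 0.5]
X_train = rng.uniform(lo, hi, size=(40, 2))

def toy_extinction(X):
    # Stand-in for an RDG-FA/DDA evaluation (illustrative only).
    N, xp = X[:, 0], X[:, 1]
    return xp**2 * np.log1p(N)

y_train = toy_extinction(X_train)

# Matérn kernel with one length scale per input dimension (ARD).
kernel = Matern(length_scale=[100.0, 0.1], nu=2.5)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_train, y_train)

# One flavor of adaptive sequential design: add the candidate point
# where the surrogate is least certain, then refit.
candidates = rng.uniform(lo, hi, size=(1000, 2))
_, cand_std = gp.predict(candidates, return_std=True)
next_point = candidates[np.argmax(cand_std)]
print("next design point:", next_point)
```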
This paper presents a new Gaussian process (GP) surrogate model for predicting the outcome of a physical experiment in which some experimental inputs are controlled through other, manipulating factors. We are particularly interested in the case where the control precision is low, so that the input factor values vary significantly even under the same setting of the corresponding manipulating factors. This case arises in our main application to carbon nanotube growth experiments, where one experimental input among many is manipulated through other manipulating factors, and the relation between the input and the manipulating factors varies significantly with the dates and times of operation. Due to this variation, a standard GP surrogate that directly relates the manipulating factors to the experimental outcome has little predictive power. At the same time, a GP model relating the main factors directly to the outcome is inappropriate for prediction, because the main factors cannot be set accurately as planned for a future experiment. Motivated by the carbon nanotube example, we propose a two-tiered GP model, where the bottom tier relates the manipulating factors to the corresponding main factors, with potential biases and variation independent of the manipulating factors, and the top tier relates the main factors to the experimental outcome. Our two-tiered model explicitly captures the propagation of control uncertainty to the experimental outcome through the two GP tiers. We present the inference and hyperparameter estimation of the proposed model. The approach is illustrated with the motivating example of a closed-loop autonomous research system for carbon nanotube growth experiments, and test results are reported with a comparison to a benchmark method, a standard GP model.
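A minimal sketch of the two-tier idea, under simplifying assumptions: one manipulating factor m, one main factor x, synthetic data, and Monte Carlo propagation in place of the paper's exact inference.

```python
# Tier 1 relates the setting m to the realized input x; tier 2 relates
# x to the outcome y. Control uncertainty is propagated by sampling
# through tier 1 before querying tier 2.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)

# Synthetic data: the realized x follows the setting m with bias and noise.
m = rng.uniform(0.0, 1.0, size=(60, 1))
x = 0.9 * m + 0.05 + 0.03 * rng.standard_normal(m.shape)
y = np.sin(6.0 * x).ravel() + 0.02 * rng.standard_normal(60)

tier1 = GaussianProcessRegressor(RBF(0.2) + WhiteKernel(1e-3)).fit(m, x.ravel())
tier2 = GaussianProcessRegressor(RBF(0.2) + WhiteKernel(1e-3)).fit(x, y)

def predict_outcome(m_new, n_samples=500):
    """Predictive mean/sd of y at setting m_new, with control uncertainty."""
    mu1, sd1 = tier1.predict(np.atleast_2d(m_new), return_std=True)
    x_samp = rng.normal(mu1, sd1, size=(n_samples, 1))   # sample tier 1
    mu2, sd2 = tier2.predict(x_samp, return_std=True)
    y_samp = rng.normal(mu2, sd2)                        # sample tier 2
    return y_samp.mean(), y_samp.std()

print(predict_outcome(0.5))
```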
The boundary problem of linear classical optics concerning the interaction of electromagnetic radiation with a thin dielectric film is solved with explicit consideration of its discrete structure. The main attention is paid to the near-zone optical response of dielectrics. The laws of reflection and refraction for discrete structures with a regular atomic distribution are studied, and the structure of the evanescent harmonics induced by an external plane wave near the surface is investigated in detail. Analytical and numerical calculations show that, owing to the evanescent harmonics, the laws of reflection and refraction at distances from the surface of less than two interatomic distances differ fundamentally from the Fresnel laws. From a practical point of view, the results of this work may be useful for near-field optical microscopy of ultrahigh resolution.
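As a back-of-envelope illustration of why the near field decays within a couple of interatomic distances, the sketch below uses standard grating kinematics for a lattice of period a; it is not the paper's full boundary-value solution, and the wavelength, spacing, and angle are assumed values.

```python
# For a surface with lattice period a, the scattered field contains
# harmonics with transverse wavevector k_n = k*sin(theta) + 2*pi*n/a.
# Orders with |k_n| > k are evanescent, decaying as exp(-kappa_n * z)
# with kappa_n = sqrt(k_n**2 - k**2).
import numpy as np

wavelength = 500e-9          # illustrative optical wavelength (m)
a = 0.3e-9                   # illustrative interatomic spacing (m)
theta = np.deg2rad(30.0)     # angle of incidence
k = 2 * np.pi / wavelength

for n in (1, 2, 3):
    k_n = k * np.sin(theta) + 2 * np.pi * n / a
    kappa = np.sqrt(k_n**2 - k**2)
    print(f"order {n}: 1/e decay depth = {1/kappa/a:.3f} interatomic spacings")
# Since a << wavelength, every nonzero order is evanescent and dies off
# within about a/(2*pi*n), i.e. well inside two interatomic distances.
```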
This article presents an original methodology for the prediction of steady turbulent aerodynamic fields. Because of the high computational cost of high-fidelity aerodynamic simulations, a surrogate model is employed to cope with significant variations of several inflow conditions. Specifically, the Local Decomposition Method presented in this paper is designed to capture the nonlinear behaviors arising from the presence of continuous and discontinuous signals. A combination of unsupervised and supervised learning algorithms, coupled with a physical criterion, automatically decomposes the input parameter space into subspaces from a limited number of high-fidelity simulations. These subspaces correspond to different flow regimes. A measure of entropy identifies the subspace with the expected strongest nonlinear behavior, allowing active resampling to be performed on this low-dimensional structure. Local reduced-order models are built on each subspace using proper orthogonal decomposition coupled with a multivariate interpolation tool. The methodology is assessed on the turbulent two-dimensional flow around the RAE2822 transonic airfoil. The Local Decomposition Method exhibits a significant improvement in prediction accuracy, compared with classical surrogate modeling, for cases with different flow regimes.
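A schematic of the local reduced-order modeling step, with plain k-means standing in for the paper's physics-informed decomposition and entropy criterion, and random placeholders for the CFD snapshots.

```python
# Per-subspace POD bases with interpolated modal coefficients.
import numpy as np
from sklearn.cluster import KMeans
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(2)

# Hypothetical data: 50 simulations of a 2000-point field,
# parameterized by (Mach, angle of attack).
params = rng.uniform([0.6, 0.0], [0.85, 4.0], size=(50, 2))
fields = rng.standard_normal((50, 2000))   # placeholder for CFD fields

km = KMeans(n_clusters=2, n_init=10).fit(params)   # subspace decomposition

local_models = {}
for c in np.unique(km.labels_):
    F = fields[km.labels_ == c]
    mean = F.mean(axis=0)
    _, _, Vt = np.linalg.svd(F - mean, full_matrices=False)
    modes = Vt[:5]                           # truncated POD basis
    coeffs = (F - mean) @ modes.T            # POD coefficients
    interp = RBFInterpolator(params[km.labels_ == c], coeffs)
    local_models[c] = (mean, modes, interp)

def predict_field(p):
    """Route a new parameter point to its subspace and reconstruct the field."""
    c = km.predict(np.atleast_2d(p))[0]
    mean, modes, interp = local_models[c]
    return mean + interp(np.atleast_2d(p))[0] @ modes

print(predict_field([0.73, 2.0]).shape)
```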
Recent works have explored the potential of machine learning as data-driven turbulence closures for RANS and LES techniques. Beyond these advances, the high expressivity and agility of physics-informed neural networks (PINNs) make them promising candidates for full fluid flow PDE modeling. An important question is whether this new paradigm, exempt from the traditional notion of discretizing the underlying operators, which is closely tied to the resolution of the flow scales, can sustain high levels of turbulence characterized by multi-scale features. We investigate PINN surrogate modeling for turbulent Rayleigh-Bénard (RB) convection flows in rough and smooth rectangular cavities, relying mainly on DNS temperature data from the fluid bulk. We carefully quantify the computational requirements under which the formulation accurately recovers the hidden flow quantities. We then propose a new padding technique that distributes some of the scattered coordinates, at which the PDE residuals are minimized, around the region of labeled data acquisition. We show that it acts as a regularization near the training boundaries, which are zones of poor accuracy for standard PINNs, and yields a noticeable global accuracy improvement at iso-budget. Finally, we propose for the first time to relax the incompressibility condition in a way that drastically benefits the optimization search and results in much improved convergence of the composite loss function. The RB results obtained at the high Rayleigh number Ra = 2 × 10^9 are particularly impressive: the predictive accuracy of the surrogate over the entire half a billion DNS coordinates yields errors for all flow variables in the range 0.3% to 4% in the relative L2 norm, with training relying on only 1.6% of the DNS data points.
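The composite-loss structure, with a soft (relaxed) incompressibility penalty rather than a hard constraint, can be sketched as below; the toy 2-D setup, network size, and weight lambda_div are illustrative choices, not the paper's configuration.

```python
# Schematic PINN composite loss in PyTorch: a data term on temperature
# labels plus a weighted divergence-free residual term.
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 3),            # outputs: u, v, T
)

def grads(f, xy):
    # Gradient of a scalar field f(x, y) with respect to the coordinates.
    return torch.autograd.grad(f, xy, torch.ones_like(f), create_graph=True)[0]

def composite_loss(xy_data, T_data, xy_res, lambda_div=0.1):
    # Data term: fit temperature labels in the fluid bulk.
    T_pred = net(xy_data)[:, 2:3]
    loss_data = ((T_pred - T_data) ** 2).mean()

    # Residual term: the relaxed incompressibility constraint
    # div(u, v) ~ 0, down-weighted by lambda_div instead of enforced hard.
    xy_res = xy_res.requires_grad_(True)
    out = net(xy_res)
    du = grads(out[:, 0:1], xy_res)    # (du/dx, du/dy)
    dv = grads(out[:, 1:2], xy_res)
    divergence = du[:, 0] + dv[:, 1]
    loss_div = (divergence ** 2).mean()

    return loss_data + lambda_div * loss_div

xy_data, T_data = torch.rand(256, 2), torch.rand(256, 1)
xy_res = torch.rand(1024, 2)           # scattered residual points
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(5):                     # a few illustrative steps
    opt.zero_grad()
    loss = composite_loss(xy_data, T_data, xy_res)
    loss.backward()
    opt.step()
print(float(loss))
```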
Gaussian process (GP) regression in large-data contexts, which arise often in surrogate modeling of stochastic simulation experiments, is challenged by cubic runtimes. Coping with input-dependent noise in that setting is doubly challenging. Recent advances target reduced computational complexity through local approximation (e.g., LAGP) or otherwise induced sparsity. Yet these do not economically accommodate replication, a common design feature when attempting to separate signal from noise. Replication can offer both statistical and computational efficiencies, motivating several extensions to the local surrogate modeling toolkit. Introducing a nugget into the local kernel structure is only the first step. We argue that a new inducing-point formulation (LIGP), already preferred over LAGP on the speed-versus-accuracy frontier, conveys additional advantages when replicates are involved. Woodbury identities allow the local kernel structure to be expressed in terms of unique design locations only, increasing the amount of data (i.e., the neighborhood size) that may be leveraged without additional flops. We demonstrate that this upgraded LIGP provides more accurate prediction and uncertainty quantification than several modern alternatives. Illustrations are provided on benchmark data, real-world simulation experiments on epidemic management and ocean oxygen concentration, and an options pricing control framework.
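The replicate economy the abstract alludes to rests on a Woodbury-type identity: with homoskedastic noise, the full n-by-n GP solve over all replicated observations equals an m-by-m solve over the unique design locations with averaged responses and replicate-scaled nuggets. A numeric check of the identity follows (LIGP itself adds inducing points and local neighborhoods on top of this).

```python
# Predictive mean via the full solve vs. the unique-site solve.
import numpy as np

rng = np.random.default_rng(3)
sigma2 = 0.05                              # noise variance

x_unique = np.linspace(0, 1, 8)            # m = 8 unique sites
reps = rng.integers(2, 6, size=8)          # replicate counts a_i
x_full = np.repeat(x_unique, reps)         # n = sum(a_i) observations
y_full = np.sin(2 * np.pi * x_full) + np.sqrt(sigma2) * rng.standard_normal(x_full.size)

k = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2 / 0.1)  # SE kernel
x_star = np.array([0.37])

# Full n x n solve.
Kn = k(x_full, x_full) + sigma2 * np.eye(x_full.size)
mu_full = k(x_star, x_full) @ np.linalg.solve(Kn, y_full)

# Equivalent m x m solve: averaged responses, nugget scaled by 1/a_i.
y_bar = np.array([y_full[x_full == u].mean() for u in x_unique])
Km = k(x_unique, x_unique) + sigma2 * np.diag(1.0 / reps)
mu_uniq = k(x_star, x_unique) @ np.linalg.solve(Km, y_bar)

print(mu_full, mu_uniq)   # identical up to floating-point error
```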