ترغب بنشر مسار تعليمي؟ اضغط هنا

A composite neural network that learns from multi-fidelity data: Application to function approximation and inverse PDE problems

238   0   0.0 ( 0 )
 نشر من قبل Xuhui Meng
 تاريخ النشر 2019
  مجال البحث فيزياء
والبحث باللغة English




اسأل ChatGPT حول البحث

We propose a new composite neural network (NN) that can be trained based on multi-fidelity data. It is comprised of three NNs, with the first NN trained using the low-fidelity data and coupled to two high-fidelity NNs, one with activation functions and another one without, in order to discover and exploit nonlinear and linear correlations, respectively, between the low-fidelity and the high-fidelity data. We first demonstrate the accuracy of the new multi-fidelity NN for approximating some standard benchmark functions but also a 20-dimensional function. Subsequently, we extend the recently developed physics-informed neural networks (PINNs) to be trained with multi-fidelity data sets (MPINNs). MPINNs contain four fully-connected neural networks, where the first one approximates the low-fidelity data, while the second and third construct the correlation between the low- and high-fidelity data and produce the multi-fidelity approximation, which is then used in the last NN that encodes the partial differential equations (PDEs). Specifically, in the two high-fidelity NNs a relaxation parameter is introduced, which can be optimized to combine the linear and nonlinear sub-networks. By optimizing this parameter, the present model is capable of learning both the linear and complex nonlinear correlations between the low- and high-fidelity data adaptively. By training the MPINNs, we can:(1) obtain the correlation between the low- and high-fidelity data, (2) infer the quantities of interest based on a few scattered data, and (3) identify the unknown parameters in the PDEs. In particular, we employ the MPINNs to learn the hydraulic conductivity field for unsaturated flows as well as the reactive models for reactive transport. The results demonstrate that MPINNs can achieve relatively high accuracy based on a very small set of high-fidelity data.

قيم البحث

اقرأ أيضاً

75 - Veit Elser 2016
We study neural networks whose only non-linear components are multipliers, to test a new training rule in a context where the precise representation of data is paramount. These networks are challenged to discover the rules of matrix multiplication, g iven many examples. By limiting the number of multipliers, the network is forced to discover the Strassen multiplication rules. This is the mathematical equivalent of finding low rank decompositions of the $ntimes n$ matrix multiplication tensor, $M_n$. We train these networks with the conservative learning rule, which makes minimal changes to the weights so as to give the correct output for each input at the time the input-output pair is received. Conservative learning needs a few thousand examples to find the rank 7 decomposition of $M_2$, and $10^5$ for the rank 23 decomposition of $M_3$ (the lowest known). High precision is critical, especially for $M_3$, to discriminate between true decompositions and border approximations.
This work investigates the framework and performance issues of the composite neural network, which is composed of a collection of pre-trained and non-instantiated neural network models connected as a rooted directed acyclic graph for solving complica ted applications. A pre-trained neural network model is generally well trained, targeted to approximate a specific function. Despite a general belief that a composite neural network may perform better than a single component, the overall performance characteristics are not clear. In this work, we construct the framework of a composite network, and prove that a composite neural network performs better than any of its pre-trained components with a high probability bound. In addition, if an extra pre-trained component is added to a composite network, with high probability, the overall performance will not be degraded. In the study, we explore a complicated application -- PM2.5 prediction -- to illustrate the correctness of the proposed composite network theory. In the empirical evaluations of PM2.5 prediction, the constructed composite neural network models support the proposed theory and perform better than other machine learning models, demonstrate the advantages of the proposed framework.
We propose a Bayesian physics-informed neural network (B-PINN) to solve both forward and inverse nonlinear problems described by partial differential equations (PDEs) and noisy data. In this Bayesian framework, the Bayesian neural network (BNN) combi ned with a PINN for PDEs serves as the prior while the Hamiltonian Monte Carlo (HMC) or the variational inference (VI) could serve as an estimator of the posterior. B-PINNs make use of both physical laws and scattered noisy measurements to provide predictions and quantify the aleatoric uncertainty arising from the noisy data in the Bayesian framework. Compared with PINNs, in addition to uncertainty quantification, B-PINNs obtain more accurate predictions in scenarios with large noise due to their capability of avoiding overfitting. We conduct a systematic comparison between the two different approaches for the B-PINN posterior estimation (i.e., HMC or VI), along with dropout used for quantifying uncertainty in deep neural networks. Our experiments show that HMC is more suitable than VI for the B-PINNs posterior estimation, while dropout employed in PINNs can hardly provide accurate predictions with reasonable uncertainty. Finally, we replace the BNN in the prior with a truncated Karhunen-Lo`eve (KL) expansion combined with HMC or a deep normalizing flow (DNF) model as posterior estimators. The KL is as accurate as BNN and much faster but this framework cannot be easily extended to high-dimensional problems unlike the BNN based framework.
We present a scale-bridging approach based on a multi-fidelity (MF) machine-learning (ML) framework leveraging Gaussian processes (GP) to fuse atomistic computational model predictions across multiple levels of fidelity. Through the posterior varianc e of the MFGP, our framework naturally enables uncertainty quantification, providing estimates of confidence in the predictions. We used Density Functional Theory as high-fidelity prediction, while a ML interatomic potential is used as the low-fidelity prediction. Practical materials design efficiency is demonstrated by reproducing the ternary composition dependence of a quantity of interest (bulk modulus) across the full aluminum-niobium-titanium ternary random alloy composition space. The MFGP is then coupled to a Bayesian optimization procedure and the computational efficiency of this approach is demonstrated by performing an on-the-fly search for the global optimum of bulk modulus in the ternary composition space. The framework presented in this manuscript is the first application of MFGP to atomistic materials simulations fusing predictions between Density Functional Theory and classical interatomic potential calculations.
The linear micro-instabilities driving turbulent transport in magnetized fusion plasmas (as well as the respective nonlinear saturation mechanisms) are known to be sensitive with respect to various physical parameters characterizing the background pl asma and the magnetic equilibrium. Therefore, uncertainty quantification is essential for achieving predictive numerical simulations of plasma turbulence. However, the high computational costs of the required gyrokinetic simulations and the large number of parameters render standard Monte Carlo techniques intractable. To address this problem, we propose a multi-fidelity Monte Carlo approach in which we employ data-driven low-fidelity models that exploit the structure of the underlying problem such as low intrinsic dimension and anisotropic coupling of the stochastic inputs. The low-fidelity models are efficiently constructed via sensitivity-driven dimension-adaptive sparse grid interpolation using both the full set of uncertain inputs and subsets comprising only selected, important parameters. We illustrate the power of this method by applying it to two plasma turbulence problems with up to $14$ stochastic parameters, demonstrating that it is up to four orders of magnitude more efficient than standard Monte Carlo methods measured in single-core performance, which translates into a runtime reduction from around eight days to one hour on 240 cores on parallel machines.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا