Conditional Deep Inverse Rosenblatt Transports

109 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Tiangang Cui

تاريخ النشر 2021

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Tiangang Cui - Sergey Dolgov - Olivier Zahm

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We present a novel offline-online method to mitigate the computational burden of the characterization of conditional beliefs in statistical learning. In the offline phase, the proposed method learns the joint law of the belief random variables and the observational random variables in the tensor-train (TT) format. In the online phase, it utilizes the resulting order-preserving conditional transport map to issue real-time characterization of the conditional beliefs given new observed information. Compared with the state-of-the-art normalizing flows techniques, the proposed method relies on function approximation and is equipped with thorough performance analysis. This also allows us to further extend the capability of transport maps in challenging problems with high-dimensional observations and high-dimensional belief variables. On the one hand, we present novel heuristics to reorder and/or reparametrize the variables to enhance the approximation power of TT. On the other, we integrate the TT-based transport maps and the parameter reordering/reparametrization into layered compositions to further improve the performance of the resulting transport maps. We demonstrate the efficiency of the proposed method on various statistical learning tasks in ordinary differential equations (ODEs) and partial differential equations (PDEs).

قيم البحث

208 - Ali Siahkoohi , Gabrio Rizzuti , Philipp A. Witte 2020

In inverse problems, we often have access to data consisting of paired samples $(x,y)sim p_{X,Y}(x,y)$ where $y$ are partial observations of a physical system, and $x$ represents the unknowns of the problem. Under these circumstances, we can employ s upervised training to learn a solution $x$ and its uncertainty from the observations $y$. We refer to this problem as the supervised case. However, the data $ysim p_{Y}(y)$ collected at one point could be distributed differently than observations $ysim p_{Y}(y)$, relevant for a current set of problems. In the context of Bayesian inference, we propose a two-step scheme, which makes use of normalizing flows and joint data to train a conditional generator $q_{theta}(x|y)$ to approximate the target posterior density $p_{X|Y}(x|y)$. Additionally, this preliminary phase provides a density function $q_{theta}(x|y)$, which can be recast as a prior for the unsupervised problem, e.g.~when only the observations $ysim p_{Y}(y)$, a likelihood model $y|x$, and a prior on $x$ are known. We then train another invertible generator with output density $q_{phi}(x|y)$ specifically for $y$, allowing us to sample from the posterior $p_{X|Y}(x|y)$. We present some synthetic results that demonstrate considerable training speedup when reusing the pretrained network $q_{theta}(x|y)$ as a warm start or preconditioning for approximating $p_{X|Y}(x|y)$, instead of learning from scratch. This training modality can be interpreted as an instance of transfer learning. This result is particularly relevant for large-scale inverse problems that employ expensive numerical simulations.

التعلم الالي التعلم الآلي الجيوفيزياء

Deep Poisson gamma dynamical systems

102 - Dandan Guo , Bo Chen , Hao Zhang 2018

We develop deep Poisson-gamma dynamical systems (DPGDS) to model sequentially observed multivariate count data, improving previously proposed models by not only mining deep hierarchical latent structure from the data, but also capturing both first-or der and long-range temporal dependencies. Using sophisticated but simple-to-implement data augmentation techniques, we derived closed-form Gibbs sampling update equations by first backward and upward propagating auxiliary latent counts, and then forward and downward sampling latent variables. Moreover, we develop stochastic gradient MCMC inference that is scalable to very long multivariate count time series. Experiments on both synthetic and a variety of real-world data demonstrate that the proposed model not only has excellent predictive performance, but also provides highly interpretable multilayer latent structure to represent hierarchical and temporal information propagation.

التعلم الالي التعلم الآلي حساب

Deep Extreme Value Copulas for Estimation and Sampling

179 - Ali Hasan , Khalil Elkhalil , Joao M. Pereira 2021

We propose a new method for modeling the distribution function of high dimensional extreme value distributions. The Pickands dependence function models the relationship between the covariates in the tails, and we learn this function using a neural ne twork that is designed to satisfy its required properties. Moreover, we present new methods for recovering the spectral representation of extreme distributions and propose a generative model for sampling from extreme copulas. Numerical examples are provided demonstrating the efficacy and promise of our proposed methods.

التعلم الالي التعلم الآلي حساب

Model-Constrained Deep Learning Approaches for Inverse Problems

97 - Hai V. Nguyen , Tan Bui-Thanh 2021

Deep Learning (DL), in particular deep neural networks (DNN), by design is purely data-driven and in general does not require physics. This is the strength of DL but also one of its key limitations when applied to science and engineering problems in which underlying physical properties (such as stability, conservation, and positivity) and desired accuracy need to be achieved. DL methods in their original forms are not capable of respecting the underlying mathematical models or achieving desired accuracy even in big-data regimes. On the other hand, many data-driven science and engineering problems, such as inverse problems, typically have limited experimental or observational data, and DL would overfit the data in this case. Leveraging information encoded in the underlying mathematical models, we argue, not only compensates missing information in low data regimes but also provides opportunities to equip DL methods with the underlying physics and hence obtaining higher accuracy. This short communication introduces several model-constrained DL approaches (including both feed-forward DNN and autoencoders) that are capable of learning not only information hidden in the training data but also in the underlying mathematical models to solve inverse problems. We present and provide intuitions for our formulations for general nonlinear problems. For linear inverse problems and linear networks, the first order optimality conditions show that our model-constrained DL approaches can learn information encoded in the underlying mathematical models, and thus can produce consistent or equivalent inverse solutions, while naive purely data-based counterparts cannot.

التعلم الالي التعلم الآلي التحسين والتحكم

Deep learning for inverse problems with unknown operator

132 - Miguel del Alamo 2021

We consider ill-posed inverse problems where the forward operator $T$ is unknown, and instead we have access to training data consisting of functions $f_i$ and their noisy images $Tf_i$. This is a practically relevant and challenging problem which cu rrent methods are able to solve only under strong assumptions on the training set. Here we propose a new method that requires minimal assumptions on the data, and prove reconstruction rates that depend on the number of training points and the noise level. We show that, in the regime of many training data, the method is minimax optimal. The proposed method employs a type of convolutional neural networks (U-nets) and empirical risk minimization in order to fit the unknown operator. In a nutshell, our approach is based on two ideas: the first is to relate U-nets to multiscale decompositions such as wavelets, thereby linking them to the existing theory, and the second is to use the hierarchical structure of U-nets and the low number of parameters of convolutional neural nets to prove entropy bounds that are practically useful. A significant difference with the existing works on neural networks in nonparametric statistics is that we use them to approximate operators and not functions, which we argue is mathematically more natural and technically more convenient.

التعلم الالي التعلم الآلي نظرية الإحصاء