In several crucial applications, domain knowledge is encoded by a system of ordinary differential equations (ODEs), often stemming from underlying physical and biological processes. A motivating example is intensive care unit patients: the dynamics of vital physiological functions, such as the cardiovascular system with its associated variables (heart rate, cardiac contractility, cardiac output, and vascular resistance), can be approximately described by a known system of ODEs. Typically, some of the ODE variables are directly observed (heart rate and blood pressure, for example) while some are unobserved (cardiac contractility, output, and vascular resistance), and in addition many other variables are observed but not modeled by the ODE, for example body temperature. Importantly, the unobserved ODE variables are known-unknowns: we know that they exist and we know their functional dynamics, but we cannot measure them directly, nor do we know the function tying them to all observed measurements. As is often the case in medicine, and specifically for the cardiovascular system, estimating these known-unknowns is highly valuable, and they serve as targets for therapeutic manipulations. Under this scenario we wish to learn the parameters of the ODE generating each observed time series, and to extrapolate the future of the ODE variables and the observations. We address this task with a variational autoencoder incorporating the known ODE function, called GOKU-net for Generative ODE modeling with Known Unknowns. We first validate our method on videos of single and double pendulums with unknown length or mass; we then apply it to a model of the cardiovascular system. We show that modeling the known-unknowns allows us to successfully discover clinically meaningful unobserved system parameters, leads to much better extrapolation, and enables learning from much smaller training sets.
Effectively modeling phenomena present in highly nonlinear dynamical systems whilst also accurately quantifying uncertainty is a challenging task, which often requires problem-specific techniques. We present a novel, domain-agnostic approach to tackling this problem, using compositions of physics-informed random features, derived from ordinary differential equations. The architecture of our model leverages recent advances in approximate inference for deep Gaussian processes, such as layer-wise weight-space approximations which allow us to incorporate random Fourier features, and stochastic variational inference for approximate Bayesian inference. We provide evidence that our model is capable of capturing highly nonlinear behaviour in real-world multivariate time series data. In addition, we find that our approach achieves comparable performance to a number of other probabilistic models on benchmark regression tasks.
Population synthesis is concerned with the generation of synthetic yet realistic representations of populations. It is a fundamental problem in the modeling of transport where the synthetic populations of micro-agents represent a key input to most agent-based models. In this paper, a new methodological framework for how to grow pools of micro-agents is presented. The model framework adopts a deep generative modeling approach from machine learning based on a Variational Autoencoder (VAE). Compared to the previous population synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs sampling and traditional generative models such as Bayesian Networks or Hidden Markov Models, the proposed method allows fitting the full joint distribution for high dimensions. The proposed methodology is compared with a conventional Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary. It is shown that, while these two methods outperform the VAE in the low-dimensional case, they both suffer from scalability issues when the number of modeled attributes increases. It is also shown that the Gibbs sampler essentially replicates the agents from the original sample when the required conditional distributions are estimated as frequency tables. In contrast, the VAE allows addressing the problem of sampling zeros by generating agents that are virtually different from those in the original data but have similar statistical properties. The presented approach can support agent-based modeling at all levels by enabling richer synthetic populations with smaller zones and more detailed individual characteristics.
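The sampling-zeros issue mentioned above can be illustrated with the classical Iterative Proportional Fitting baseline. The following is a minimal sketch, not the implementation used in the paper: the function name `ipf`, the 2x2 seed table, and the target margins are all hypothetical and chosen only for illustration. It shows that a cell that is empty in the seed sample stays empty after fitting, however the margins are rescaled.

```python
def ipf(seed, row_targets, col_targets, iters=500, tol=1e-9):
    """Rescale a non-negative seed table until its margins match the targets.

    Hypothetical helper for illustration; assumes the targets are feasible
    for the zero pattern of the seed table.
    """
    table = [row[:] for row in seed]
    for _ in range(iters):
        # Row step: scale every row to its target total.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Column step: scale every column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns are exact after the column step, so check the rows.
        row_err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if row_err < tol:
            break
    return table


# Toy seed sample with one unobserved attribute combination (a sampling zero).
seed = [[10.0, 0.0],
        [5.0, 5.0]]
row_targets = [60.0, 40.0]
col_targets = [70.0, 30.0]

fitted = ipf(seed, row_targets, col_targets)
for row in fitted:
    print([round(v, 3) for v in row])
```

Because IPF only multiplies existing cells by scale factors, the zero cell can never become positive. A generative model that fits a smooth joint distribution does not share this restriction, which is the contrast the abstract draws.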
We propose a new framework named DS-WGAN that integrates the doubly stochastic (DS) structure and Wasserstein generative adversarial networks (WGAN) to model, estimate, and simulate a wide class of arrival processes with general non-stationary and random arrival rates. Regarding statistical properties, we prove consistency and a convergence rate for the estimator obtained from the DS-WGAN framework under a non-parametric smoothness condition. Regarding computational efficiency and tractability, we address a challenge in gradient evaluation and model estimation arising from the discontinuity in the simulator. We then show that the DS-WGAN framework can conveniently facilitate what-if simulation and predictive simulation for future scenarios that differ from the history. Numerical experiments with synthetic and real data sets are implemented to demonstrate the performance of DS-WGAN. The performance is measured from both a statistical perspective and an operational performance evaluation perspective. The numerical experiments suggest that successful model estimation with DS-WGAN requires only a moderately sized set of representative data, which can be appealing in many contexts of operational management.
Progressively applying Gaussian noise transforms complex data distributions to approximately Gaussian. Reversing this dynamic defines a generative model. When the forward noising process is given by a Stochastic Differential Equation (SDE), Song et al. (2021) demonstrate how the time-inhomogeneous drift of the associated reverse-time SDE may be estimated using score-matching. A limitation of this approach is that the forward-time SDE must be run for a sufficiently long time for the final distribution to be approximately Gaussian. In contrast, solving the Schrödinger Bridge (SB) problem, i.e. an entropy-regularized optimal transport problem on path spaces, yields diffusions which generate samples from the data distribution in finite time. We present Diffusion SB (DSB), an original approximation of the Iterative Proportional Fitting (IPF) procedure to solve the SB problem, and provide theoretical analysis along with generative modeling experiments. The first DSB iteration recovers the methodology proposed by Song et al. (2021), with the flexibility of using shorter time intervals, as subsequent DSB iterations reduce the discrepancy between the final-time marginal of the forward (resp. backward) SDE with respect to the prior (resp. data) distribution. Beyond generative modeling, DSB offers a widely applicable computational optimal transport tool as the continuous state-space analogue of the popular Sinkhorn algorithm (Cuturi, 2013).
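The time-horizon limitation described above can be seen in a small simulation. The sketch below is an illustration under stated assumptions rather than the setup of the paper: it assumes an Ornstein-Uhlenbeck forward SDE, dX = -X dt + sqrt(2) dW, whose stationary law is the standard Gaussian, a toy bimodal data distribution at ±3, and an Euler-Maruyama discretization. The names `forward_noise` and `variance` are hypothetical helpers.

```python
import math
import random


def forward_noise(x0, horizon, dt=0.01, rng=None):
    """Euler-Maruyama simulation of dX = -X dt + sqrt(2) dW started at x0."""
    rng = rng or random.Random()
    x = x0
    for _ in range(round(horizon / dt)):
        x += -x * dt + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
    return x


def variance(samples):
    """Population variance of a list of floats."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


rng = random.Random(0)
# Toy bimodal "data": equal mass at -3 and +3 (an assumption for illustration).
data = [rng.choice([-3.0, 3.0]) for _ in range(1000)]

short_run = [forward_noise(x, horizon=0.2, rng=rng) for x in data]
long_run = [forward_noise(x, horizon=5.0, rng=rng) for x in data]

# The standard Gaussian prior has variance 1; only the long run gets close.
print("variance after T=0.2:", round(variance(short_run), 2))
print("variance after T=5.0:", round(variance(long_run), 2))
```

For this SDE the variance at time T is 1 + 8 exp(-2T) when the data have variance 9, so a short horizon leaves the final marginal far from the prior. This is the gap that later DSB iterations are described as reducing.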
The family of f-divergences is ubiquitously applied to generative modeling in order to adapt the distribution of the model to that of the data. Well-definedness of f-divergences, however, requires the distributions of the data and model to overlap completely at every step of training. As a result, as soon as the supports of the data and model distributions contain non-overlapping portions, gradient-based training of the corresponding model becomes hopeless. Recent advances in generative modeling are full of remedies for handling this support mismatch problem: key ideas include either replacing the objective function with integral probability metrics (IPMs), which are well-behaved even on distributions with disjoint supports, or optimizing a well-behaved variational lower bound instead of the true objective. We, on the other hand, establish that a complete change of the objective function is unnecessary, and that an augmentation of the base measure of the problematic divergence can instead resolve the issue. Based on this observation, we propose a generative model which leverages the class of Scaled Bregman Divergences and generalizes both f-divergences and Bregman divergences. We analyze this class of divergences and show that, with an appropriate choice of base measure, it can resolve the support mismatch problem and incorporate geometric information. Finally, we study the performance of the proposed method and demonstrate promising results on the MNIST, CelebA and CIFAR-10 datasets.
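The support mismatch problem can be made concrete with two point masses on a small grid. The sketch below is only an illustration of the problem, not of the Scaled Bregman approach proposed in the abstract. It compares the KL divergence (an f-divergence) with the Wasserstein-1 distance (an IPM); the helper names `kl_divergence` and `wasserstein_1` and the five-point grid are assumptions made for this example.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # convention: 0 * log(0 / q) = 0
        if qi == 0.0:
            return math.inf  # p has mass where q has none
        total += pi * math.log(pi / qi)
    return total


def wasserstein_1(p, q):
    """W1 on the unit-spaced grid 0..n-1: sum of absolute CDF differences."""
    cdf_gap = 0.0
    total = 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        total += abs(cdf_gap)
    return total


data = [1.0, 0.0, 0.0, 0.0, 0.0]  # all data mass at grid point 0
results = {}
for shift in (1, 2, 4):
    model = [0.0] * 5
    model[shift] = 1.0  # all model mass at grid point `shift`
    results[shift] = (kl_divergence(data, model), wasserstein_1(data, model))
    print(shift, results[shift])
```

The KL divergence is infinite for every shift, so it carries no information about which model is closer to the data. The Wasserstein-1 distance grows with the shift and therefore still provides a usable training signal.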