We present Korali, an open-source framework for large-scale Bayesian uncertainty quantification and stochastic optimization. The framework relies on non-intrusive sampling of complex multiphysics models and enables their exploitation for optimization and decision-making. In addition, its distributed sampling engine makes efficient use of massively-parallel architectures while introducing novel fault tolerance and load balancing mechanisms. We demonstrate these features by interfacing Korali with existing high-performance software such as Aphros, LAMMPS (CPU-based), and Mirheo (GPU-based) and show efficient scaling for up to 512 nodes of the CSCS Piz Daint supercomputer. Finally, we present benchmarks demonstrating that Korali outperforms related state-of-the-art software frameworks.
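The non-intrusive pattern described above needs nothing from the simulation beyond forward evaluations: the sampler proposes parameters, calls the model, and scores the output against data. As a minimal illustrative sketch in Python (this is not Korali's API; run_model, the noise level, and the step size are hypothetical placeholders), a random-walk Metropolis sampler over a black-box model could look like this:

    import numpy as np

    def run_model(theta):
        # Stand-in for an external simulation (e.g. an MPI/GPU job);
        # the sampler treats it purely as a black box.
        x = np.linspace(0.0, 1.0, 50)
        return theta[0] * x + theta[1]

    def log_likelihood(theta, data, sigma=0.1):
        # Gaussian measurement-noise likelihood of the observed data.
        return -0.5 * np.sum(((data - run_model(theta)) / sigma) ** 2)

    def metropolis(data, n_samples=5000, step=0.05):
        theta = np.zeros(2)
        logl = log_likelihood(theta, data)
        chain = []
        for _ in range(n_samples):
            proposal = theta + step * np.random.randn(2)
            logl_prop = log_likelihood(proposal, data)
            if np.log(np.random.rand()) < logl_prop - logl:  # accept/reject
                theta, logl = proposal, logl_prop
            chain.append(theta.copy())
        return np.array(chain)

    # Example usage with synthetic data.
    data = run_model([2.0, 1.0]) + 0.1 * np.random.randn(50)
    posterior_samples = metropolis(data)

In a distributed setting it is the expensive run_model calls that a framework of this kind farms out across many nodes, which is why fault tolerance and load balancing matter; the accept/reject bookkeeping itself stays cheap.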
Bayesian optimization is a class of global optimization techniques. It regards the underlying objective function as a realization of a Gaussian process. Although the outputs of Bayesian optimization are random according to the Gaussian process assumption, quantification of this uncertainty is rarely studied in the literature. In this work, we propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms by constructing confidence regions for the maximum point and the maximum value of the objective function. These regions can be computed efficiently, and their confidence levels are guaranteed by newly developed uniform error bounds for sequential Gaussian process regression. Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria.
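To make the construction concrete, one generic way such confidence regions follow from a uniform error bound is sketched below (notation assumed here for illustration, not taken from the paper). Suppose that, after $n$ sequentially chosen evaluations, the Gaussian process posterior mean $\mu_n$ and standard deviation $\sigma_n$ satisfy
\[
|f(x) - \mu_n(x)| \le \beta_n \, \sigma_n(x) \quad \text{for all } x, \text{ with probability at least } 1 - \alpha.
\]
Then the maximum point and the maximum value are covered, at level $1-\alpha$, by
\[
C_n^{\mathrm{point}} = \Big\{ x : \mu_n(x) + \beta_n \sigma_n(x) \ge \max_{x'} \big( \mu_n(x') - \beta_n \sigma_n(x') \big) \Big\},
\qquad
C_n^{\mathrm{value}} = \Big[ \max_x \big( \mu_n(x) - \beta_n \sigma_n(x) \big),\ \max_x \big( \mu_n(x) + \beta_n \sigma_n(x) \big) \Big].
\]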
This work affords new insights into Bayesian CART in the context of structured wavelet shrinkage. The main thrust is to develop a formal inferential framework for Bayesian tree-based regression. We reframe Bayesian CART as a g-type prior, which departs from the typical wavelet product priors by harnessing the correlation induced by the tree topology. The Bayesian CART priors used in practice are shown to attain adaptive near rate-minimax posterior concentration in the supremum norm in regression models. For the fundamental goal of uncertainty quantification, we construct adaptive confidence bands for the regression function with uniform coverage under self-similarity. In addition, we show that tree-posteriors enable optimal inference in the form of efficient confidence sets for smooth functionals of the regression function.
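For orientation, the general shape of a supremum-norm concentration statement of this kind, written in the standard fixed-design Gaussian regression model and with notation assumed here (not the paper's exact theorem), reads
\[
Y_i = f_0(x_i) + \varepsilon_i, \quad \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \quad i = 1, \dots, n,
\qquad
\Pi\!\left( f : \|f - f_0\|_\infty > M_n \Big(\tfrac{\log n}{n}\Big)^{\alpha/(2\alpha+1)} \,\Big|\, Y^{(n)} \right) \to 0
\]
in probability for $\alpha$-H\"older truths $f_0$, where $(\log n / n)^{\alpha/(2\alpha+1)}$ is the near-minimax rate for sup-norm estimation; adaptivity means the prior attains this rate without knowledge of $\alpha$.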
In this paper, we present a new stochastic framework for parameter estimation and uncertainty quantification in colon cancer-induced angiogenesis using patient data. The dynamics of colon cancer are given by a stochastic process that captures the inherent randomness in the system. The stochastic framework is based on the Fokker-Planck equation, which represents the evolution of the probability density function corresponding to the stochastic process. An optimization problem is formulated that takes as input individual patient data with randomness present and is solved to obtain the unknown parameters corresponding to the individual tumor characteristics. Furthermore, a sensitivity analysis of the optimal parameter set is performed to determine the parameters that need to be controlled, thus providing information about the types of drugs that can be used for treatment.
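In generic form (notation assumed here, not the paper's exact system), such a setup pairs an It\^o stochastic process for the tumor state with its Fokker-Planck equation and a PDE-constrained fit to patient data:
\[
dX_t = b(X_t; \theta)\, dt + \sigma(X_t; \theta)\, dW_t,
\qquad
\partial_t p(x, t) = -\nabla \cdot \big( b(x; \theta)\, p(x, t) \big) + \tfrac{1}{2} \sum_{i,j} \partial_{x_i} \partial_{x_j} \big[ (\sigma \sigma^{\top})_{ij}(x; \theta)\, p(x, t) \big],
\]
\[
\min_{\theta} \ \sum_{k} \big\| p(\cdot, t_k; \theta) - \hat{p}_k \big\|^2 + \lambda \, \|\theta\|^2 \quad \text{subject to the Fokker-Planck equation,}
\]
where $\hat{p}_k$ denotes (a density estimate of) the patient data at measurement time $t_k$ and $\lambda$ is a regularization weight.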
Bayesian optimization (BO) is a flexible and powerful framework that is suitable for computationally expensive simulation-based applications and guarantees statistical convergence to the global optimum. While it remains one of the most popular optimization methods, its capability is hindered by the size of the data, the dimensionality of the considered problem, and the nature of sequential optimization. These scalability issues are intertwined with each other and must be tackled simultaneously. In this work, we propose the Scalable$^3$-BO framework, which employs sparse GP as the underlying surrogate model to cope with Big Data and is equipped with a random embedding to efficiently optimize high-dimensional problems with low effective dimensionality. The Scalable$^3$-BO framework is further augmented with an asynchronous parallelization feature, which fully exploits the computational resources on HPC within a computational budget. As a result, the proposed Scalable$^3$-BO framework is scalable from three independent perspectives: data size, dimensionality, and computational resources on HPC. The goal of this work is to push the frontiers of BO beyond its well-known scalability issues and minimize the wall-clock waiting time for optimizing high-dimensional, computationally expensive applications. We demonstrate the capability of Scalable$^3$-BO with 1 million data points on 10,000-dimensional problems, using 20 concurrent workers in an HPC environment.
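The random-embedding ingredient can be sketched independently of the sparse-GP surrogate and the asynchronous workers: candidates are proposed in a low-dimensional space and lifted into the ambient space through a fixed random matrix, so the expensive objective only ever sees full-dimensional points. The Python fragment below is an illustrative REMBO-style sketch; the dimensions, bounds, and objective are placeholders, not the authors' implementation:

    import numpy as np

    D, d = 10_000, 10            # ambient and effective dimensions (illustrative)
    A = np.random.randn(D, d)    # random embedding matrix, drawn once and kept fixed

    def lift(y, box=1.0):
        # Map a low-dimensional candidate y into the box [-box, box]^D.
        return np.clip(A @ y, -box, box)

    def objective(x):
        # Stand-in for the expensive simulation; only a few coordinates
        # matter, mimicking low effective dimensionality.
        return -np.sum(x[:5] ** 2)

    # Any BO loop (sparse-GP surrogate, asynchronous workers, ...) proposes
    # candidates y in R^d and evaluates objective(lift(y)) in R^D.
    y_candidate = np.random.randn(d)
    value = objective(lift(y_candidate))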
Within a Bayesian statistical framework using the standard Skyrme-Hartree-Fock model, the maximum a posteriori (MAP) values and uncertainties of nuclear matter incompressibility and isovector interaction parameters are inferred from the experimental data of giant resonances and neutron-skin thicknesses of typical heavy nuclei. With the uncertainties of the isovector interaction parameters constrained by the data of the isovector giant dipole resonance and the neutron-skin thickness, we have obtained $K_0 = 223_{-8}^{+7}$ MeV at the 68% confidence level using the data of the isoscalar giant monopole resonance in $^{208}$Pb measured at the Research Center for Nuclear Physics (RCNP), Japan, and at Texas A&M University (TAMU), USA. Although the corresponding $^{120}$Sn data give a MAP value for $K_0$ about 5 MeV smaller than the $^{208}$Pb data, there are significant overlaps in their posterior probability distribution functions.
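The generic ingredients of such an inference, written here in assumed notation rather than taken from the paper, are Bayes' theorem with an independent-Gaussian likelihood over the measured observables:
\[
P(\theta \mid D) \propto \pi(\theta) \prod_{i} \exp\!\left[ -\frac{\big( d_i^{\mathrm{exp}} - d_i^{\mathrm{th}}(\theta) \big)^2}{2 \sigma_i^2} \right],
\]
where $\theta$ collects $K_0$ and the isovector interaction parameters, $d_i^{\mathrm{th}}(\theta)$ are the Skyrme-Hartree-Fock predictions for the giant-resonance energies and neutron-skin thicknesses, $d_i^{\mathrm{exp}}$ the corresponding measurements with uncertainties $\sigma_i$, and $\pi(\theta)$ the prior; MAP values and 68% intervals are then read off the (marginal) posterior.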