The idea to distinguish and quantify two important types of uncertainty, often referred to as aleatoric and epistemic, has received increasing attention in machine learning research in the last couple of years. In this paper, we consider ensemble-based approaches to uncertainty quantification. Distinguishing between different types of uncertainty-aware learning algorithms, we specifically focus on Bayesian methods and approaches based on so-called credal sets, which naturally suggest themselves from an ensemble learning point of view. For both approaches, we address the question of how to quantify aleatoric and epistemic uncertainty. The effectiveness of corresponding measures is evaluated and compared in an empirical study on classification with a reject option.
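One common way to quantify the two types of uncertainty from an ensemble is the entropy-based decomposition. The entropy of the averaged prediction is taken as total uncertainty, the average entropy of the individual members as aleatoric uncertainty, and their difference (the mutual information) as epistemic uncertainty. The sketch below illustrates that decomposition only. It is not necessarily the set of measures evaluated in the paper, and the function names `entropy` and `decompose_uncertainty` are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution; 0*log(0) is treated as 0."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(member_probs):
    """Entropy-based split of predictive uncertainty for one input.

    member_probs: list of class-probability vectors, one per ensemble member
    (assumed to be equally weighted; this weighting is an assumption).
    Returns (total, aleatoric, epistemic) in nats.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    # Averaged prediction of the ensemble.
    mean_probs = [
        sum(p[k] for p in member_probs) / n_members for k in range(n_classes)
    ]
    total = entropy(mean_probs)                                   # H[mean prediction]
    aleatoric = sum(entropy(p) for p in member_probs) / n_members  # mean of H[member]
    epistemic = total - aleatoric                                  # mutual information
    return total, aleatoric, epistemic


if __name__ == "__main__":
    # Members agree on a maximally uncertain prediction: purely aleatoric.
    print(decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]]))
    # Members are individually certain but disagree: purely epistemic.
    print(decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]]))
```

In a reject-option setting, one would typically abstain on the inputs with the highest value of whichever measure is being evaluated.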
Bayesian Neural Networks (BNNs) place priors over the parameters in a neural network. Inference in BNNs, however, is difficult; all inference methods for BNNs are approximate. In this work, we empirically compare the quality of predictive uncertainty estimates for 10 common inference methods on both regression and classification tasks. Our experiments demonstrate that commonly used metrics (e.g. test log-likelihood) can be misleading. Our experiments also indicate that inference innovations designed to capture structure in the posterior do not necessarily produce high quality posterior approximations.
This work affords new insights into Bayesian CART in the context of structured wavelet shrinkage. The main thrust is to develop a formal inferential framework for Bayesian tree-based regression. We reframe Bayesian CART as a g-type prior which departs from the typical wavelet product priors by harnessing correlation induced by the tree topology. The practically used Bayesian CART priors are shown to attain adaptive near rate-minimax posterior concentration in the supremum norm in regression models. For the fundamental goal of uncertainty quantification, we construct adaptive confidence bands for the regression function with uniform coverage under self-similarity. In addition, we show that tree-posteriors enable optimal inference in the form of efficient confidence sets for smooth functionals of the regression function.
Bayesian optimization is a class of global optimization techniques. It regards the underlying objective function as a realization of a Gaussian process. Although the outputs of Bayesian optimization are random according to the Gaussian process assumption, quantification of this uncertainty is rarely studied in the literature. In this work, we propose a novel approach to assess the output uncertainty of Bayesian optimization algorithms, in terms of constructing confidence regions of the maximum point or value of the objective function. These regions can be computed efficiently, and their confidence levels are guaranteed by newly developed uniform error bounds for sequential Gaussian process regression. Our theory provides a unified uncertainty quantification framework for all existing sequential sampling policies and stopping criteria.
Meta-learning, or learning to learn, offers a principled framework for few-shot learning. It leverages data from multiple related learning tasks to infer an inductive bias that enables fast adaptation on a new task. The application of meta-learning was recently proposed for learning how to demodulate from few pilots. The idea is to use pilots received and stored for offline use from multiple devices in order to meta-learn an adaptation procedure with the aim of speeding up online training on new devices. Standard frequentist learning, which can yield relatively accurate hard classification decisions, is known to be poorly calibrated, particularly in the small-data regime. Poor calibration implies that the soft scores output by the demodulator are inaccurate estimates of the true probability of correct demodulation. In this work, we introduce the use of Bayesian meta-learning via variational inference for the purpose of obtaining well-calibrated few-pilot demodulators. In a Bayesian framework, each neural network weight is represented by a distribution, capturing epistemic uncertainty. Bayesian meta-learning optimizes over the prior distribution of the weights. The resulting Bayesian ensembles offer better-calibrated soft decisions, at the computational cost of running multiple instances of the neural network for demodulation. Numerical results for single-input single-output Rayleigh fading channels with transmitter non-linearities are provided that compare symbol error rate and expected calibration error for both frequentist and Bayesian meta-learning, illustrating how the latter is both more accurate and better calibrated.
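The expected calibration error mentioned above is usually estimated by binning predictions by confidence and averaging the gap between accuracy and mean confidence in each bin, weighted by bin size. The following is a minimal sketch of that standard binned estimator. The function name, the choice of 10 equal-width bins, and the bin-edge convention are assumptions for illustration, not details taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE estimate.

    confidences: predicted probability of the chosen symbol, each in [0, 1].
    correct: booleans, True where the hard decision was right.
    n_bins: number of equal-width confidence bins (assumed value).
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Bin i covers [i/n_bins, (i+1)/n_bins); a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    n_total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n_total) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # Overconfident toy example: 95% confidence but only 3 of 4 decisions correct.
    print(expected_calibration_error([0.95] * 4, [True, True, True, False]))
```

A well-calibrated demodulator would drive this value toward zero, since in every bin the average soft score would match the empirical rate of correct demodulation.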
Within a Bayesian statistical framework using the standard Skyrme-Hartree-Fock model, the maximum a posteriori (MAP) values and uncertainties of nuclear matter incompressibility and isovector interaction parameters are inferred from the experimental data of giant resonances and neutron-skin thicknesses of typical heavy nuclei. With the uncertainties of the isovector interaction parameters constrained by the data of the isovector giant dipole resonance and the neutron-skin thickness, we have obtained $K_0 = 223_{-8}^{+7}$ MeV at 68% confidence level using the data of the isoscalar giant monopole resonance in $^{208}$Pb measured at the Research Center for Nuclear Physics (RCNP), Japan, and at Texas A&M University (TAMU), USA. Although the corresponding $^{120}$Sn data give a MAP value for $K_0$ about 5 MeV smaller than the $^{208}$Pb data, there are significant overlaps in their posterior probability distribution functions.