Explaining Bayesian Neural Networks


Abstract in English

To make advanced learning machines such as Deep Neural Networks (DNNs) more transparent in decision making, explainable AI (XAI) aims to provide interpretations of DNNs predictions. These interpretations are usually given in the form of heatmaps, each one illustrating relevant patterns regarding the prediction for a given instance. Bayesian approaches such as Bayesian Neural Networks (BNNs) so far have a limited form of transparency (model transparency) already built-in through their prior weight distribution, but notably, they lack explanations of their predictions for given instances. In this work, we bring together these two perspectives of transparency into a holistic explanation framework for explaining BNNs. Within the Bayesian framework, the network weights follow a probability distribution. Hence, the standard (deterministic) prediction strategy of DNNs extends in BNNs to a predictive distribution, and thus the standard explanation extends to an explanation distribution. Exploiting this view, we uncover that BNNs implicitly employ multiple heterogeneous prediction strategies. While some of these are inherited from standard DNNs, others are revealed to us by considering the inherent uncertainty in BNNs. Our quantitative and qualitative experiments on toy/benchmark data and real-world data from pathology show that the proposed approach of explaining BNNs can lead to more effective and insightful explanations.

Download