The overall predictive uncertainty of a trained predictor can be decomposed into separate contributions due to epistemic and aleatoric uncertainty. Under a Bayesian formulation, and assuming a well-specified model, the two contributions can be exactly expressed (for the log-loss) or bounded (for more general losses) in terms of information-theoretic quantities (Xu and Raginsky, 2020). This paper studies epistemic uncertainty within an information-theoretic framework in the broader setting of Bayesian meta-learning. A general hierarchical Bayesian model is assumed, in which hyperparameters determine the per-task priors of the model parameters. Exact characterizations (for the log-loss) and bounds (for more general losses) are derived for the epistemic uncertainty, quantified by the minimum excess meta-risk (MEMR), of optimal meta-learning rules. This characterization is leveraged to provide insights into the dependence of the epistemic uncertainty on the number of tasks and on the amount of per-task training data. Experiments compare the proposed information-theoretic bounds, evaluated via neural mutual information estimators, with the performance of a novel approximate fully Bayesian meta-learning strategy termed Langevin-Stein Bayesian Meta-Learning (LS-BML).
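For intuition on the log-loss decomposition referenced above, the sketch below illustrates the standard Bayesian split of predictive entropy into an aleatoric term (expected conditional entropy) and an epistemic term (the mutual information between the label and the model parameters), estimated by Monte Carlo from posterior samples. This is not the paper's MEMR characterization or the LS-BML procedure; the function names and the toy setup are assumptions made purely for illustration.

import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    # Shannon entropy (in nats) of a categorical distribution along `axis`.
    return -np.sum(p * np.log(p + eps), axis=axis)

def uncertainty_decomposition(pred_probs):
    # pred_probs: array of shape (S, C) holding class probabilities
    # p(y | x, theta_s) for S samples theta_s drawn from the posterior.
    # Returns (total, aleatoric, epistemic), where
    #   total     = H[ E_s p(y | x, theta_s) ]   (entropy of the Bayesian predictive)
    #   aleatoric = E_s H[ p(y | x, theta_s) ]   (expected conditional entropy)
    #   epistemic = total - aleatoric            (mutual information I(y; theta | x))
    mean_pred = pred_probs.mean(axis=0)           # Bayesian model average
    total = entropy(mean_pred)
    aleatoric = entropy(pred_probs, axis=-1).mean()
    epistemic = total - aleatoric
    return total, aleatoric, epistemic

# Toy illustration: three posterior samples that disagree on a binary label,
# so the epistemic (mutual-information) term is strictly positive.
samples = np.array([[0.9, 0.1],
                    [0.5, 0.5],
                    [0.1, 0.9]])
print(uncertainty_decomposition(samples))

In this toy example the aleatoric term reflects the average per-sample label noise, while the epistemic term captures the disagreement among posterior samples and shrinks as the posterior concentrates with more data.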