Generalising well in supervised learning tasks relies on correctly extrapolating the training data to a large region of the input space. One way to achieve this is to constrain the predictions to be invariant to transformations on the input that are known to be irrelevant (e.g. translation). Commonly, this is done through data augmentation, where the training set is enlarged by applying hand-crafted transformations to the inputs. We argue that invariances should instead be incorporated in the model structure, and learned using the marginal likelihood, which correctly rewards the reduced complexity of invariant models. We demonstrate this for Gaussian process models, due to the ease with which their marginal likelihood can be estimated. Our main contribution is a variational inference scheme for Gaussian processes containing invariances described by a sampling procedure. We learn the sampling procedure by back-propagating through it to maximise the marginal likelihood.
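To make the idea concrete, below is a minimal sketch under simplifying assumptions: a one-dimensional regression GP whose kernel is made invariant to additive input shifts by averaging a base RBF kernel over sampled shifts, with the shift half-width learned by backpropagating through the reparameterised samples to maximise the exact Gaussian marginal likelihood. The helper names (`rbf`, `invariant_kernel`, `gp_log_marginal_likelihood`) and the uniform-shift invariance are illustrative choices; this simplified Monte Carlo version is not the paper's variational inducing-point scheme.

```python
import math
import torch

# Sketch: learn how much shift-invariance the data supports by maximising
# the GP marginal likelihood w.r.t. kernel hyperparameters AND the shift range.

def rbf(x1, x2, lengthscale, variance):
    """Squared-exponential kernel between (N, D) and (M, D) inputs."""
    d2 = (x1.unsqueeze(1) - x2.unsqueeze(0)).pow(2).sum(-1)
    return variance * torch.exp(-0.5 * d2 / lengthscale ** 2)

def invariant_kernel(x1, x2, lengthscale, variance, width, n_samples=10):
    """Monte Carlo estimate of k_inv(x, x') = E_{a,a'}[k(x + a, x' + a')].

    One shared set of shifts a_s ~ U(-width, width) is applied to both
    arguments, which keeps the estimated Gram matrix symmetric and PSD.
    Shifts are reparameterised (width * eps) so gradients reach `width`.
    """
    eps = width * (2.0 * torch.rand(n_samples, 1, 1) - 1.0)   # (S, 1, 1)
    xs1 = x1.unsqueeze(0) + eps                               # (S, N, D)
    xs2 = x2.unsqueeze(0) + eps
    ks = [rbf(a, b, lengthscale, variance) for a in xs1 for b in xs2]
    return torch.stack(ks).mean(0)

def gp_log_marginal_likelihood(K, y, noise):
    """log N(y | 0, K + noise * I) for zero-mean GP regression."""
    n = y.shape[0]
    L = torch.linalg.cholesky(K + noise * torch.eye(n))
    alpha = torch.cholesky_solve(y.unsqueeze(-1), L).squeeze(-1)
    return (-0.5 * (y * alpha).sum()
            - torch.log(torch.diagonal(L)).sum()
            - 0.5 * n * math.log(2.0 * math.pi))

# Toy data; the optimiser decides how much invariance the marginal likelihood rewards.
torch.manual_seed(0)
X = torch.linspace(-3.0, 3.0, 40).unsqueeze(-1)
y = torch.sin(2.0 * X).squeeze(-1) + 0.1 * torch.randn(40)

log_params = torch.zeros(4, requires_grad=True)   # lengthscale, variance, noise, width
opt = torch.optim.Adam([log_params], lr=0.05)
for _ in range(200):
    lengthscale, variance, noise, width = log_params.exp()
    K = invariant_kernel(X, X, lengthscale, variance, width)
    loss = -gp_log_marginal_likelihood(K, y, noise)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("learned shift half-width:", log_params.exp()[3].item())
```

The key point the sketch illustrates is that the marginal likelihood trades data fit against model complexity, so the optimiser can discover how much invariance the data actually supports rather than having it fixed by hand-crafted augmentation.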
While recent progress has spawned very powerful machine learning systems, the resulting agents remain extremely specialized and fail to transfer the knowledge they gain to similar yet unseen tasks. In this paper, we study a simple reinforcement learning problem.
A fundamental challenge for any intelligent system is prediction: given some inputs $X_1, \ldots, X_\tau$, can you predict outcomes $Y_1, \ldots, Y_\tau$? The KL divergence $\mathbf{d}_{\mathrm{KL}}$ provides a natural measure of prediction quality, but the majority
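For reference, the prediction-quality measure in question is the standard KL divergence between the true distribution $P$ and the predicted distribution $Q$ over outcomes given the inputs (the symbols $P$ and $Q$ and the explicit conditioning are notation introduced here, not taken from the abstract):

$$
\mathbf{d}_{\mathrm{KL}}\!\left(P \,\|\, Q\right)
= \mathbb{E}_{Y_{1:\tau} \sim P(\cdot \mid X_{1:\tau})}\!\left[\log \frac{P(Y_{1:\tau} \mid X_{1:\tau})}{Q(Y_{1:\tau} \mid X_{1:\tau})}\right] \ge 0,
$$

with equality exactly when the predictor $Q$ matches the true conditional $P$.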
While energy-based models (EBMs) exhibit a number of desirable properties, training and sampling on high-dimensional datasets remain challenging. Inspired by recent progress on diffusion probabilistic models, we present a diffusion recovery likelihood method.
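The abstract is cut off here, but to sketch the underlying idea: in the standard recovery-likelihood formulation (assumed here to be the basis of the method; the symbols $f_\theta$ and $\sigma$ are introduced for illustration), one conditions on a noisy observation $\tilde{\mathbf{x}} = \mathbf{x} + \sigma\,\boldsymbol{\epsilon}$ and trains the EBM with energy $f_\theta$ to recover the clean sample:

$$
p_\theta(\mathbf{x} \mid \tilde{\mathbf{x}})
= \frac{1}{\tilde{Z}_\theta(\tilde{\mathbf{x}})}
\exp\!\left( f_\theta(\mathbf{x}) - \frac{1}{2\sigma^2}\lVert \tilde{\mathbf{x}} - \mathbf{x}\rVert^2 \right),
$$

which is easier to sample from than the marginal $p_\theta(\mathbf{x}) \propto \exp(f_\theta(\mathbf{x}))$ because the quadratic term keeps samples close to $\tilde{\mathbf{x}}$; a diffusion-style sequence of noise levels then chains these conditionals together.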
We propose a Bayesian approximate inference method for learning the dependence structure of a Gaussian graphical model. Using pseudo-likelihood, we derive an analytical expression to approximate the marginal likelihood for an arbitrary graph structure.
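For reference, the pseudo-likelihood of a Gaussian graphical model with precision matrix $\Theta$ replaces the joint density by the product of full conditionals, each of which is a univariate Gaussian regression on the remaining variables (the notation below is generic, not the paper's):

$$
\tilde{p}(\mathbf{x} \mid \Theta) = \prod_{j=1}^{p} p(x_j \mid \mathbf{x}_{-j}, \Theta),
\qquad
x_j \mid \mathbf{x}_{-j} \sim \mathcal{N}\!\left(-\frac{1}{\Theta_{jj}} \sum_{k \neq j} \Theta_{jk}\, x_k,\ \frac{1}{\Theta_{jj}}\right),
$$

so zeros in $\Theta$ correspond directly to absent edges in the graph, which is what makes the pseudo-likelihood a convenient surrogate when scoring graph structures.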
Marginal-likelihood-based model selection, even though promising, is rarely used in deep learning due to estimation difficulties. Instead, most approaches rely on validation data, which may not be readily available. In this work, we present a scalable approach to marginal-likelihood estimation.
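For context, the quantity whose estimation is difficult is the marginal likelihood (evidence) of a model $\mathcal{M}$ with parameters $\theta$ given data $\mathcal{D}$:

$$
p(\mathcal{D} \mid \mathcal{M}) = \int p(\mathcal{D} \mid \theta, \mathcal{M})\, p(\theta \mid \mathcal{M})\, d\theta,
$$

an integral over all network weights that is intractable for deep networks and therefore has to be approximated; maximising it over model choices implements Bayesian model selection without a validation set.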