Fast and Flexible Multi-Task Classification Using Conditional Neural Adaptive Processes

103 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Jonathan Gordon

تاريخ النشر 2019

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف James Requeima - Jonathan Gordon - John Bronskill

التعلم الالي التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The goal of this paper is to design image classification systems that, after an initial multi-task training phase, can automatically adapt to new tasks encountered at test time. We introduce a conditional neural process based approach to the multi-task classification setting for this purpose, and establish connections to the meta-learning and few-shot learning literature. The resulting approach, called CNAPs, comprises a classifier whose parameters are modulated by an adaptation network that takes the current tasks dataset as input. We demonstrate that CNAPs achieves state-of-the-art results on the challenging Meta-Dataset benchmark indicating high-quality transfer-learning. We show that the approach is robust, avoiding both over-fitting in low-shot regimes and under-fitting in high-shot regimes. Timing experiments reveal that CNAPs is computationally efficient at test-time as it does not involve gradient based adaptation. Finally, we show that trained models are immediately deployable to continual learning and active learning where they can outperform existing approaches that do not leverage transfer learning.

قيم البحث

331 - Raoul Heese , Michael Bortz 2020

We present a novel adaptive optimization algorithm for black-box multi-objective optimization problems with binary constraints on the foundation of Bayes optimization. Our method is based on probabilistic regression and classification models, which a ct as a surrogate for the optimization goals and allow us to suggest multiple design points at once in each iteration. The proposed acquisition function is intuitively understandable and can be tuned to the demands of the problems at hand. We also present a novel ellipsoid truncation method to speed up the expected hypervolume calculation in a straightforward way for regression models with a normal probability density. We benchmark our approach with an evolutionary algorithm on multiple test problems.

التعلم الالي التعلم الآلي

A Flexible and Adaptive Framework for Abstention Under Class Imbalance

40 - Avanti Shrikumar , Amr Alexandari , Anshul Kundaje 2018

In practical applications of machine learning, it is often desirable to identify and abstain on examples where the models predictions are likely to be incorrect. Much of the prior work on this topic focused on out-of-distribution detection or perform ance metrics such as top-k accuracy. Comparatively little attention was given to metrics such as area-under-the-curve or Cohens Kappa, which are extremely relevant for imbalanced datasets. Abstention strategies aimed at top-k accuracy can produce poor results on these metrics when applied to imbalanced datasets, even when all examples are in-distribution. We propose a framework to address this gap. Our framework leverages the insight that calibrated probability estimates can be used as a proxy for the true class labels, thereby allowing us to estimate the change in an arbitrary metric if an example were abstained on. Using this framework, we derive computationally efficient metric-specific abstention algorithms for optimizing the sensitivity at a target specificity level, the area under the ROC, and the weighted Cohens Kappa. Because our method relies only on calibrated probability estimates, we further show that by leveraging recent work on domain adaptation under label shift, we can generalize to test-set distributions that may have a different class imbalance compared to the training set distribution. On various experiments involving medical imaging, natural language processing, computer vision and genomics, we demonstrate the effectiveness of our approach. Source code available at https://github.com/blindauth/abstention. Colab notebooks reproducing results available at https://github.com/blindauth/abstention_experiments.

التعلم الالي التعلم الآلي

Large Margin Multi-modal Multi-task Feature Extraction for Image Classification

70 - Yong Luo , Yonggang Wen , Dacheng Tao 2019

The features used in many image analysis-based applications are frequently of very high dimension. Feature extraction offers several advantages in high-dimensional cases, and many recent studies have used multi-task feature extraction approaches, whi ch often outperform single-task feature extraction approaches. However, most of these methods are limited in that they only consider data represented by a single type of feature, even though features usually represent images from multiple modalities. We therefore propose a novel large margin multi-modal multi-task feature extraction (LM3FE) framework for handling multi-modal features for image classification. In particular, LM3FE simultaneously learns the feature extraction matrix for each modality and the modality combination coefficients. In this way, LM3FE not only handles correlated and noisy features, but also utilizes the complementarity of different modalities to further help reduce feature redundancy in each modality. The large margin principle employed also helps to extract strongly predictive features so that they are more suitable for prediction (e.g., classification). An alternating algorithm is developed for problem optimization and each sub-problem can be efficiently solved. Experiments on two challenging real-world image datasets demonstrate the effectiveness and superiority of the proposed method.

التعلم الالي الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

Double-descent curves in neural networks: a new perspective using Gaussian processes

84 - Ouns El Harzli , Guillermo Valle-Perez , Ard A. Louis 2021

Double-descent curves in neural networks describe the phenomenon that the generalisation error initially descends with increasing parameters, then grows after reaching an optimal number of parameters which is less than the number of data points, but then descends again in the overparameterised regime. Here we use a neural network Gaussian process (NNGP) which maps exactly to a fully connected network (FCN) in the infinite width limit, combined with techniques from random matrix theory, to calculate this generalisation behaviour, with a particular focus on the overparameterised regime. An advantage of our NNGP approach is that the analytical calculations are easier to interpret. We argue that neural network generalization performance improves in the overparameterised regime precisely because that is where they converge to their equivalent Gaussian process.

التعلم الالي التعلم الآلي

Conditional Density Estimation with Neural Networks: Best Practices and Benchmarks

347 - Jonas Rothfuss , Fabio Ferreira , Simon Walther 2019

Given a set of empirical observations, conditional density estimation aims to capture the statistical relationship between a conditional variable $mathbf{x}$ and a dependent variable $mathbf{y}$ by modeling their conditional probability $p(mathbf{y}| mathbf{x})$. The paper develops best practices for conditional density estimation for finance applications with neural networks, grounded on mathematical insights and empirical evaluations. In particular, we introduce a noise regularization and data normalization scheme, alleviating problems with over-fitting, initialization and hyper-parameter sensitivity of such estimators. We compare our proposed methodology with popular semi- and non-parametric density estimators, underpin its effectiveness in various benchmarks on simulated and Euro Stoxx 50 data and show its superior performance. Our methodology allows to obtain high-quality estimators for statistical expectations of higher moments, quantiles and non-linear return transformations, with very little assumptions about the return dynamic.

التعلم الالي التعلم الآلي المالية الحاسوبية