No Arabic abstract
We extend first-order model agnostic meta-learning algorithms (including FOMAML and Reptile) to image segmentation, present a novel neural network architecture built for fast learning which we call EfficientLab, and leverage a formal definition of the test error of meta-learning algorithms to decrease error on out of distribution tasks. We show state of the art results on the FSS-1000 dataset by meta-training EfficientLab with FOMAML and using Bayesian optimization to infer the optimal test-time adaptation routine hyperparameters. We also construct a small benchmark dataset, FP-k, for the empirical study of how meta-learning systems perform in both few- and many-shot settings. On the FP-k dataset, we show that meta-learned initializations provide value for canonical few-shot image segmentation but their performance is quickly matched by conventional transfer learning with performance being equal beyond 10 labeled examples. Our code, meta-learned model, and the FP-k dataset are available at https://github.com/ml4ai/mliis .
Building in silico models to predict chemical properties and activities is a crucial step in drug discovery. However, limited labeled data often hinders the application of deep learning in this setting. Meanwhile advances in meta-learning have enabled state-of-the-art performances in few-shot learning benchmarks, naturally prompting the question: Can meta-learning improve deep learning performance in low-resource drug discovery projects? In this work, we assess the transferability of graph neural networks initializations learned by the Model-Agnostic Meta-Learning (MAML) algorithm - and its variants FO-MAML and ANIL - for chemical properties and activities tasks. Using the ChEMBL20 dataset to emulate low-resource settings, our benchmark shows that meta-initializations perform comparably to or outperform multi-task pre-training baselines on 16 out of 20 in-distribution tasks and on all out-of-distribution tasks, providing an average improvement in AUPRC of 11.2% and 26.9% respectively. Finally, we observe that meta-initializations consistently result in the best performing models across fine-tuning sets with $k in {16, 32, 64, 128, 256}$ instances.
Despite its best performance in image denoising, the supervised deep denoising methods require paired noise-clean data, which are often unavailable. To address this challenge, Noise2Noise was designed based on the fact that paired noise-clean images can be replaced by paired noise-noise images that are easier to collect. However, in many scenarios the collection of paired noise-noise images is still impractical. To bypass labeled images, Noise2Void methods predict masked pixels from their surroundings with single noisy images only and give improved denoising results that still need improvements. An observation on classic denoising methods is that non-local mean (NLM) outcomes are typically superior to locally denoised results. In contrast, Noise2Void and its variants do not utilize self-similarities in an image as the NLM-based methods do. Here we propose Noise2Sim, an NLM-inspired self-learning method for image denoising. Specifically, Noise2Sim leverages the self-similarity of image pixels to train the denoising network, requiring single noisy images only. Our theoretical analysis shows that Noise2Sim tends to be equivalent to Noise2Noise under mild conditions. To efficiently manage the computational burden for globally searching similar pixels, we design a two-step procedure to provide data for Noise2Sim training. Extensive experiments demonstrate the superiority of Noise2Sim on common benchmark datasets.
We present a new approach, called meta-meta classification, to learning in small-data settings. In this approach, one uses a large set of learning problems to design an ensemble of learners, where each learner has high bias and low variance and is skilled at solving a specific type of learning problem. The meta-meta classifier learns how to examine a given learning problem and combine the various learners to solve the problem. The meta-meta learning approach is especially suited to solving few-shot learning tasks, as it is easier to learn to classify a new learning problem with little data than it is to apply a learning algorithm to a small data set. We evaluate the approach on a one-shot, one-class-versus-all classification task and show that it is able to outperform traditional meta-learning as well as ensembling approaches.
Active Learning methods create an optimized labeled training set from unlabeled data. We introduce a novel Online Active Deep Learning method for Medical Image Analysis. We extend our MedAL active learning framework to present new results in this paper. Our novel sampling method queries the unlabeled examples that maximize the average distance to all training set examples. Our online method enhances performance of its underlying baseline deep network. These novelties contribute significant performance improvements, including improving the models underlying deep network accuracy by 6.30%, using only 25% of the labeled dataset to achieve baseline accuracy, reducing backpropagated images during training by as much as 67%, and demonstrating robustness to class imbalance in binary and multi-class tasks.
Recent work for image captioning mainly followed an extract-then-generate paradigm, pre-extracting a sequence of object-based features and then formulating image captioning as a single sequence-to-sequence task. Although promising, we observed two problems in generated captions: 1) content inconsistency where models would generate contradicting facts; 2) not informative enough where models would miss parts of important information. From a causal perspective, the reason is that models have captured spurious statistical correlations between visual features and certain expressions (e.g., visual features of long hair and woman). In this paper, we propose a dependent multi-task learning framework with the causal intervention (DMTCI). Firstly, we involve an intermediate task, bag-of-categories generation, before the final task, image captioning. The intermediate task would help the model better understand the visual features and thus alleviate the content inconsistency problem. Secondly, we apply Pearls do-calculus on the model, cutting off the link between the visual features and possible confounders and thus letting models focus on the causal visual features. Specifically, the high-frequency concept set is considered as the proxy confounders where the real confounders are inferred in the continuous space. Finally, we use a multi-agent reinforcement learning (MARL) strategy to enable end-to-end training and reduce the inter-task error accumulations. The extensive experiments show that our model outperforms the baseline models and achieves competitive performance with state-of-the-art models.