No Arabic abstract
Analysis of cardiac ultrasound images is commonly performed in routine clinical practice for quantification of cardiac function. Its increasing automation frequently employs deep learning networks that are trained to predict disease or detect image features. However, such models are extremely data-hungry and training requires labelling of many thousands of images by experienced clinicians. Here we propose the use of contrastive learning to mitigate the labelling bottleneck. We train view classification models for imbalanced cardiac ultrasound datasets and show improved performance for views/classes for which minimal labelled data is available. Compared to a naive baseline model, we achieve an improvement in F1 score of up to 26% in those views while maintaining state-of-the-art performance for the views with sufficiently many labelled training observations.
The goal of few-shot classification is to classify new categories with few labeled examples within each class. Nowadays, the excellent performance in handling few-shot classification problems is shown by metric-based meta-learning methods. However, it is very hard for previous methods to discriminate the fine-grained sub-categories in the embedding space without fine-grained labels. This may lead to unsatisfactory generalization to fine-grained subcategories, and thus affects model interpretation. To tackle this problem, we introduce the contrastive loss into few-shot classification for learning latent fine-grained structure in the embedding space. Furthermore, to overcome the drawbacks of random image transformation used in current contrastive learning in producing noisy and inaccurate image pairs (i.e., views), we develop a learning-to-learn algorithm to automatically generate different views of the same image. Extensive experiments on standard few-shot learning benchmarks demonstrate the superiority of our method.
Brain tumor is a common and fatal form of cancer which affects both adults and children. The classification of brain tumors into different types is hence a crucial task, as it greatly influences the treatment that physicians will prescribe. In light of this, medical imaging techniques, especially those applying deep convolutional networks followed by a classification layer, have been developed to make possible computer-aided classification of brain tumor types. In this paper, we present a novel approach of directly learning deep embeddings for brain tumor types, which can be used for downstream tasks such as classification. Along with using triplet loss variants, our approach applies contrastive learning to performing unsupervised pre-training, combined with a rare-case data augmentation module to effectively ameliorate the lack of data problem in the brain tumor imaging analysis domain. We evaluate our method on an extensive brain tumor dataset which consists of 27 different tumor classes, out of which 13 are defined as rare. With a common encoder during all the experiments, we compare our approach with a baseline classification-layer based model, and the results well prove the effectiveness of our approach across all measured metrics.
Previous Online Knowledge Distillation (OKD) often carries out mutually exchanging probability distributions, but neglects the useful representational knowledge. We therefore propose Multi-view Contrastive Learning (MCL) for OKD to implicitly capture correlations of feature embeddings encoded by multiple peer networks, which provide various views for understanding the input data instances. Benefiting from MCL, we can learn a more discriminative representation space for classification than previous OKD methods. Experimental results on image classification demonstrate that our MCL-OKD outperforms other state-of-the-art OKD methods by large margins without sacrificing additional inference cost. Codes are available at https://github.com/winycg/MCL-OKD.
Few-shot learning aims to transfer information from one task to enable generalization on novel tasks given a few examples. This information is present both in the domain and the class labels. In this work we investigate the complementary roles of these two sources of information by combining instance-discriminative contrastive learning and supervised learning in a single framework called Supervised Momentum Contrastive learning (SUPMOCO). Our approach avoids a problem observed in supervised learning where information in images not relevant to the task is discarded, which hampers their generalization to novel tasks. We show that (self-supervised) contrastive learning and supervised learning are mutually beneficial, leading to a new state-of-the-art on the META-DATASET - a recently introduced benchmark for few-shot learning. Our method is based on a simple modification of MOCO and scales better than prior work on combining supervised and self-supervised learning. This allows us to easily combine data from multiple domains leading to further improvements.
For artificial learning systems, continual learning over time from a stream of data is essential. The burgeoning studies on supervised continual learning have achieved great progress, while the study of catastrophic forgetting in unsupervised learning is still blank. Among unsupervised learning methods, self-supervise learning method shows tremendous potential on visual representation without any labeled data at scale. To improve the visual representation of self-supervised learning, larger and more varied data is needed. In the real world, unlabeled data is generated at all times. This circumstance provides a huge advantage for the learning of the self-supervised method. However, in the current paradigm, packing previous data and current data together and training it again is a waste of time and resources. Thus, a continual self-supervised learning method is badly needed. In this paper, we make the first attempt to implement the continual contrastive self-supervised learning by proposing a rehearsal method, which keeps a few exemplars from the previous data. Instead of directly combining saved exemplars with the current data set for training, we leverage self-supervised knowledge distillation to transfer contrastive information among previous data to the current network by mimicking similarity score distribution inferred by the old network over a set of saved exemplars. Moreover, we build an extra sample queue to assist the network to distinguish between previous and current data and prevent mutual interference while learning their own feature representation. Experimental results show that our method performs well on CIFAR100 and ImageNet-Sub. Compared with the baselines, which learning tasks without taking any technique, we improve the image classification top-1 accuracy by 1.60% on CIFAR100, 2.86% on ImageNet-Sub and 1.29% on ImageNet-Full under 10 incremental steps setting.