Class imbalance and noisy labels are the norm rather than the exception in many large-scale classification datasets. Nevertheless, most work in machine learning assumes balanced and clean data. There have been recent attempts to tackle, on the one hand, the problem of learning from noisy labels and, on the other, learning from long-tailed data, but each group of methods makes simplifying assumptions about the other. Due to this separation, the proposed solutions often underperform when both assumptions are violated. In this work, we present a simple two-stage approach, based on recent advances in self-supervised learning, that treats both challenges simultaneously: first, task-agnostic self-supervised pre-training, followed by task-specific fine-tuning with an appropriate loss. Most significantly, we find that self-supervised learning approaches can effectively cope with severe class imbalance. In addition, the resulting learned representations are remarkably robust to label noise when fine-tuned with an imbalance- and noise-resistant loss function. We validate our claims with experiments on CIFAR-10 and CIFAR-100 augmented with synthetic imbalance and noise, as well as on the large-scale, inherently noisy Clothing-1M dataset.
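The two-stage recipe described in that abstract can be summarised in a few lines of training code. The sketch below is illustrative only, not the paper's implementation: the encoder architecture, the hypothetical `encoder_ssl.pt` checkpoint standing in for the stage-one self-supervised pre-training, and the choice of symmetric cross-entropy as the noise- and imbalance-resistant fine-tuning loss are all placeholder assumptions.

```python
# A minimal sketch of the two-stage recipe, assuming PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SymmetricCrossEntropy(nn.Module):
    """Symmetric CE (standard CE plus reverse CE), a commonly used
    noise-tolerant classification loss; one possible choice here."""
    def __init__(self, alpha=0.1, beta=1.0, num_classes=10):
        super().__init__()
        self.alpha, self.beta, self.num_classes = alpha, beta, num_classes

    def forward(self, logits, target):
        ce = F.cross_entropy(logits, target)
        pred = F.softmax(logits, dim=1).clamp(min=1e-7)
        labels = F.one_hot(target, self.num_classes).float().clamp(min=1e-4)
        rce = -(pred * labels.log()).sum(dim=1).mean()
        return self.alpha * ce + self.beta * rce

# Stage 1: task-agnostic self-supervised pre-training (e.g. a SimCLR/MoCo run)
# is assumed to have produced an encoder checkpoint; only stage 2 is shown.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512), nn.ReLU())
# encoder.load_state_dict(torch.load("encoder_ssl.pt"))  # hypothetical checkpoint
classifier = nn.Linear(512, 10)
model = nn.Sequential(encoder, classifier)

criterion = SymmetricCrossEntropy(num_classes=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Stage 2: task-specific fine-tuning on the (imbalanced, noisy) labelled set.
x = torch.randn(8, 3, 32, 32)          # dummy batch standing in for CIFAR images
y = torch.randint(0, 10, (8,))         # possibly noisy labels
loss = criterion(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Any other robust loss (for example a class-balanced or logit-adjusted variant) could be swapped into the fine-tuning stage without changing the overall two-stage structure.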
This is a technical report for the CVPR 2021 AliProducts Challenge, a competition for studying the large-scale, fine-grained commodity image recognition problem encountered by world-leading e-commerce companies.
Multi-label image classification has generated significant interest in recent years, and the performance of such systems often suffers from the not-infrequent occurrence of incorrect or missing labels in the training data.
We propose a framework that uses contrastive learning as a pre-training task to perform image classification in the presence of noisy labels. Recent strategies include pseudo-labeling, sample selection with Gaussian mixture models, and weighted supervised contrastive learning.
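One of the strategies named in that abstract, sample selection with a Gaussian mixture model fitted to per-sample losses, can be sketched as follows. The use of scikit-learn's `GaussianMixture`, the loss normalisation, and the 0.5 probability threshold are assumptions made for illustration, not the framework's actual code.

```python
# A hedged sketch of loss-based sample selection with a two-component GMM.
import numpy as np
from sklearn.mixture import GaussianMixture

def select_clean_samples(per_sample_losses, clean_prob_threshold=0.5):
    """Fit a 2-component GMM to per-sample training losses and keep the
    samples most likely to belong to the low-loss ("clean") component."""
    losses = np.asarray(per_sample_losses, dtype=np.float64).reshape(-1, 1)
    # Normalise losses to [0, 1] so the fit is scale-independent.
    losses = (losses - losses.min()) / (losses.max() - losses.min() + 1e-8)
    gmm = GaussianMixture(n_components=2, max_iter=100, reg_covar=5e-4)
    gmm.fit(losses)
    clean_component = int(np.argmin(gmm.means_.ravel()))      # low-loss mode
    clean_prob = gmm.predict_proba(losses)[:, clean_component]
    return np.where(clean_prob > clean_prob_threshold)[0], clean_prob

# Example: losses recorded for 6 training samples after a warm-up epoch.
indices, probs = select_clean_samples([0.05, 0.07, 0.9, 0.06, 1.2, 0.08])
print(indices)   # indices of samples treated as clean, e.g. [0 1 3 5]
```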
Training a deep neural network relies heavily on a large amount of training data with accurate annotations. To alleviate this problem, various methods have been proposed to annotate the data automatically.
Imperfect labels are ubiquitous in real-world datasets. Several recent successful methods for training deep neural networks (DNNs) to be robust to label noise have relied on two primary techniques, the first being the filtering of samples based on their loss during a warm-up phase in order to curate an initial subset for training.
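A minimal sketch of that warm-up-and-filter idea, assuming PyTorch: train briefly with plain cross-entropy while recording each sample's loss, then hand the recorded losses to a small-loss or GMM-based filter such as the one sketched earlier. The model, the toy data, and the `warmup_and_record_losses` helper are hypothetical placeholders, not any specific paper's code.

```python
# Warm-up training that records per-sample losses for later filtering.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

def warmup_and_record_losses(model, loader, epochs=1, lr=0.01):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    per_sample_loss = torch.zeros(len(loader.dataset))
    for _ in range(epochs):
        for x, y, idx in loader:
            logits = model(x)
            losses = F.cross_entropy(logits, y, reduction="none")
            per_sample_loss[idx] = losses.detach()   # keep each sample's loss
            losses.mean().backward()
            optimizer.step()
            optimizer.zero_grad()
    return per_sample_loss  # feed into a small-loss or GMM-based filter

# Toy usage: the dataset also yields sample indices so losses can be tracked.
x, y = torch.randn(64, 20), torch.randint(0, 5, (64,))
loader = DataLoader(TensorDataset(x, y, torch.arange(64)), batch_size=16)
model = nn.Linear(20, 5)
losses = warmup_and_record_losses(model, loader)
```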