ﻻ يوجد ملخص باللغة العربية
For classification tasks, deep neural networks are prone to overfitting in the presence of label noise. Although existing methods are able to alleviate this problem at low noise levels, they encounter significant performance reduction at high noise levels, or even at medium noise levels when the label noise is asymmetric. To train classifiers that are universally robust to all noise levels, and that are not sensitive to any variation in the noise model, we propose a distillation-based framework that incorporates a new subcategory of Positive-Unlabeled learning. In particular, we shall assume that a small subset of any given noisy dataset is known to have correct labels, which we treat as positive, while the remaining noisy subset is treated as unlabeled. Our framework consists of the following two components: (1) We shall generate, via iterative updates, an augmented clean subset with additional reliable positive samples filtered from unlabeled samples; (2) We shall train a teacher model on this larger augmented clean set. With the guidance of the teacher model, we then train a student model on the whole dataset. Experiments were conducted on the CIFAR-10 dataset with synthetic label noise at multiple noise levels for both symmetric and asymmetric noise. The results show that our framework generally outperforms at medium to high noise levels. We also evaluated our framework on Clothing1M, a real-world noisy dataset, and we achieved 2.94% improvement in accuracy over existing state-of-the-art methods.
In Multiple Instance learning (MIL), weak labels are provided at the bag level with only presence/absence information known. However, there is a considerable gap in performance in comparison to a fully supervised model, limiting the practical applica
The real-world data is often susceptible to label noise, which might constrict the effectiveness of the existing state of the art algorithms for ordinal regression. Existing works on ordinal regression do not take label noise into account. We propose
Long-tailed learning has attracted much attention recently, with the goal of improving generalisation for tail classes. Most existing works use supervised learning without considering the prevailing noise in the training dataset. To move long-tailed
In label-noise learning, textit{noise transition matrix}, denoting the probabilities that clean labels flip into noisy labels, plays a central role in building textit{statistically consistent classifiers}. Existing theories have shown that the transi
Although state-of-the-art PDF malware classifiers can be trained with almost perfect test accuracy (99%) and extremely low false positive rate (under 0.1%), it has been shown that even a simple adversary can evade them. A practically useful malware c