ﻻ يوجد ملخص باللغة العربية
Numerous researchers recently applied empirical spectral analysis to the study of modern deep learning classifiers. We identify and discuss an important formal class/cross-class structure and show how it lies at the origin of the many visually striking features observed in deepnet spectra, some of which were reported in recent articles, others are unveiled here for the first time. These include spectral outliers, spikes, and small but distinct continuous distributions, bumps, often seen beyond the edge of a main bulk. The significance of the cross-class structure is illustrated in three ways: (i) we prove the ratio of outliers to bulk in the spectrum of the Fisher information matrix is predictive of misclassification, in the context of multinomial logistic regression; (ii) we demonstrate how, gradually with depth, a network is able to separate class-distinctive information from class variability, all while orthogonalizing the class-distinctive information; and (iii) we propose a correction to KFAC, a well-known second-order optimization algorithm for training deepnets.
Semi-Supervised Learning (SSL) has achieved great success in overcoming the difficulties of labeling and making full use of unlabeled data. However, SSL has a limited assumption that the numbers of samples in different classes are balanced, and many
Normalization techniques have proved to be a crucial ingredient of successful training in a traditional supervised learning regime. However, in the zero-shot learning (ZSL) world, these ideas have received only marginal attention. This work studies n
With the memory-resource-limited constraints, class-incremental learning (CIL) usually suffers from the catastrophic forgetting problem when updating the joint classification model on the arrival of newly added classes. To cope with the forgetting pr
We propose a novel unsupervised generative model that learns to disentangle object identity from other low-level aspects in class-imbalanced data. We first investigate the issues surrounding the assumptions about uniformity made by InfoGAN, and demon
Few-Shot Learning (FSL) algorithms are commonly trained through Meta-Learning (ML), which exposes models to batches of tasks sampled from a meta-dataset to mimic tasks seen during evaluation. However, the standard training procedures overlook the rea