ﻻ يوجد ملخص باللغة العربية
As the data scale grows, deep recognition models often suffer from long-tailed data distributions due to the heavy imbalanced sample number across categories. Indeed, real-world data usually exhibit some similarity relation among different categories (e.g., pigeons and sparrows), called category similarity in this work. It is doubly difficult when the imbalance occurs between such categories with similar appearances. However, existing solutions mainly focus on the sample number to re-balance data distribution. In this work, we systematically investigate the essence of the long-tailed problem from a unified perspective. Specifically, we demonstrate that long-tailed recognition suffers from both sample number and category similarity. Intuitively, using a toy example, we first show that sample number is not the unique influence factor for performance dropping of long-tailed recognition. Theoretically, we demonstrate that (1) category similarity, as an inevitable factor, would also influence the model learning under long-tailed distribution via similar samples, (2) using more discriminative representation methods (e.g., self-supervised learning) for similarity reduction, the classifier bias can be further alleviated with greatly improved performance. Extensive experiments on several long-tailed datasets verify the rationality of our theoretical analysis, and show that based on existing state-of-the-arts (SOTAs), the performance could be further improved by similarity reduction. Our investigations highlight the essence behind the long-tailed problem, and claim several feasible directions for future work.
Label distributions in real-world are oftentimes long-tailed and imbalanced, resulting in biased models towards dominant labels. While long-tailed recognition has been extensively studied for image classification tasks, limited effort has been made f
The problem of long-tailed recognition, where the number of examples per class is highly unbalanced, is considered. It is hypothesized that the well known tendency of standard classifier training to overfit to popular classes can be exploited for eff
The long-tail distribution of the visual world poses great challenges for deep learning based classification models on how to handle the class imbalance problem. Existing solutions usually involve class-balancing strategies, e.g., by loss re-weightin
Long-tailed problem has been an important topic in face recognition task. However, existing methods only concentrate on the long-tailed distribution of classes. Differently, we devote to the long-tailed domain distribution problem, which refers to th
Real-world imagery is often characterized by a significant imbalance of the number of images per class, leading to long-tailed distributions. An effective and simple approach to long-tailed visual recognition is to learn feature representations and a