ﻻ يوجد ملخص باللغة العربية
Zero-shot learning extends the conventional object classification to the unseen class recognition by introducing semantic representations of classes. Existing approaches predominantly focus on learning the proper mapping function for visual-semantic embedding, while neglecting the effect of learning discriminative visual features. In this paper, we study the significance of the discriminative region localization. We propose a semantic-guided multi-attention localization model, which automatically discovers the most discriminative parts of objects for zero-shot learning without any human annotations. Our model jointly learns cooperative global and local features from the whole object as well as the detected parts to categorize objects based on semantic descriptions. Moreover, with the joint supervision of embedding softmax loss and class-center triplet loss, the model is encouraged to learn features with high inter-class dispersion and intra-class compactness. Through comprehensive experiments on three widely used zero-shot learning benchmarks, we show the efficacy of the multi-attention localization and our proposed approach improves the state-of-the-art results by a considerable margin.
In image recognition, there are many cases where training samples cannot cover all target classes. Zero-shot learning (ZSL) utilizes the class semantic information to classify samples of the unseen categories that have no corresponding samples contai
Contrastive learning has shown superior performance in embedding global and spatial invariant features in computer vision (e.g., image classification). However, its overall success of embedding local and spatial variant features is still limited, esp
Relation classification aims to extract semantic relations between entity pairs from the sentences. However, most existing methods can only identify seen relation classes that occurred during training. To recognize unseen relations at test time, we e
In the process of exploring the world, the curiosity constantly drives humans to cognize new things. Supposing you are a zoologist, for a presented animal image, you can recognize it immediately if you know its class. Otherwise, you would more likely
Semantic segmentation models are limited in their ability to scale to large numbers of object classes. In this paper, we introduce the new task of zero-shot semantic segmentation: learning pixel-wise classifiers for never-seen object categories with