General-purpose semantic segmentation relies on a backbone CNN to extract discriminative features that help classify each image pixel into a seen object class (i.e., one of the object classes available during training) or a background class. Zero-shot semantic segmentation is a challenging task that requires a computer vision model to identify image pixels belonging to an object class it has never seen before. Equipping a general-purpose semantic segmentation model to separate image pixels of unseen classes from the background remains an open challenge. Some recent models have approached this problem by fine-tuning the final pixel classification layer of a semantic segmentation model for a zero-shot setting, but they struggle to learn discriminative features due to the lack of supervision. We propose a recursive training scheme to supervise the retraining of a semantic segmentation model for a zero-shot setting using a pseudo-feature representation. To this end, we propose a Zero-Shot Maximum Mean Discrepancy (ZS-MMD) loss that weighs high-confidence outputs of the pixel classification layer as a pseudo-feature representation and feeds it back to the generator. By closing the loop on the generator end, we provide supervision during retraining that in turn helps the model learn a more discriminative feature representation for unseen classes. We show that with our recursive training and ZS-MMD loss, our proposed model achieves state-of-the-art performance on the Pascal-VOC 2012 and Pascal-Context datasets.
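The abstract does not state the exact form of the ZS-MMD loss, so the following is only a minimal sketch of the general idea it names: a squared Maximum Mean Discrepancy between two feature sets in which samples are weighted by classifier confidence. The Gaussian kernel, the biased estimator, the function names (`rbf_kernel`, `weighted_mmd2`, `confidence_weights`), the bandwidth `sigma`, and the `threshold` of 0.9 are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clamp tiny negative values caused by floating-point cancellation.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma**2))


def _normalize(weights, n):
    """Return weights that sum to 1; uniform weights if none are given."""
    if weights is None:
        return np.full(n, 1.0 / n)
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return weights / total


def weighted_mmd2(x, y, wx=None, wy=None, sigma=1.0):
    """Biased estimate of the squared MMD between weighted samples x and y."""
    wx = _normalize(wx, len(x))
    wy = _normalize(wy, len(y))
    k_xx = rbf_kernel(x, x, sigma)
    k_yy = rbf_kernel(y, y, sigma)
    k_xy = rbf_kernel(x, y, sigma)
    return float(wx @ k_xx @ wx + wy @ k_yy @ wy - 2.0 * (wx @ k_xy @ wy))


def confidence_weights(probs, threshold=0.9):
    """Hypothetical weighting: keep the max class probability of each pixel
    if it reaches the threshold, otherwise give the pixel zero weight."""
    conf = probs.max(axis=1)
    return np.where(conf >= threshold, conf, 0.0)
```

In a training loop of the kind the abstract describes, `x` would hold generated features for unseen classes, `y` the pseudo-features derived from the pixel classifier, and `wy` the output of `confidence_weights` on that classifier's per-pixel probabilities. How the paper actually combines these quantities is not specified here.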
Semantic segmentation models are limited in their ability to scale to large numbers of object classes. In this paper, we introduce the new task of zero-shot semantic segmentation: learning pixel-wise classifiers for never-seen object categories with
Zero-shot learning is an essential part of computer vision. As a classical downstream task, zero-shot semantic segmentation has been studied because of its application value. One of the popular zero-shot semantic segmentation methods is based on the ge
Being able to segment unseen classes not observed during training is an important technical challenge in deep learning, because of its potential to reduce the expensive annotation required for semantic segmentation. Prior zero-label semantic segmenta
Unlike conventional zero-shot classification, zero-shot semantic segmentation predicts a class label at the pixel level instead of the image level. When solving zero-shot semantic segmentation problems, the need for pixel-level prediction with surrou
Deep learning has significantly improved the precision of instance segmentation with abundant labeled data. However, in many fields, such as medicine and manufacturing, collecting sufficient data is extremely hard and labeling this data requires high profe