ﻻ يوجد ملخص باللغة العربية
zero-shot learning is an essential part of computer vision. As a classical downstream task, zero-shot semantic segmentation has been studied because of its applicant value. One of the popular zero-shot semantic segmentation methods is based on the generative model Most new proposed works added structures on the same architecture to enhance this model. However, we found that, from the view of causal inference, the result of the original model has been influenced by spurious statistical relationships. Thus the performance of the prediction shows severe bias. In this work, we consider counterfactual methods to avoid the confounder in the original model. Based on this method, we proposed a new framework for zero-shot semantic segmentation. Our model is compared with baseline models on two real-world datasets, Pascal-VOC and Pascal-Context. The experiment results show proposed models can surpass previous confounded models and can still make use of additional structures to improve the performance. We also design a simple structure based on Graph Convolutional Networks (GCN) in this work.
Semantic segmentation models are limited in their ability to scale to large numbers of object classes. In this paper, we introduce the new task of zero-shot semantic segmentation: learning pixel-wise classifiers for never-seen object categories with
Unlike conventional zero-shot classification, zero-shot semantic segmentation predicts a class label at the pixel level instead of the image level. When solving zero-shot semantic segmentation problems, the need for pixel-level prediction with surrou
General purpose semantic segmentation relies on a backbone CNN network to extract discriminative features that help classify each image pixel into a seen object class (ie., the object classes available during training) or a background class. Zero-sho
Deep learning has significantly improved the precision of instance segmentation with abundant labeled data. However, in many areas like medical and manufacturing, collecting sufficient data is extremely hard and labeling this data requires high profe
Convolutional neural networks are able to learn realistic image priors from numerous training samples in low-level image generation and restoration. We show that, for high-level image recognition tasks, we can further reconstruct realistic images of