ﻻ يوجد ملخص باللغة العربية
Few-shot segmentation aims at assigning a category label to each image pixel with few annotated samples. It is a challenging task since the dense prediction can only be achieved under the guidance of latent features defined by sparse annotations. Existing meta-learning method tends to fail in generating category-specifically discriminative descriptor when the visual features extracted from support images are marginalized in embedding space. To address this issue, this paper presents an adaptive tuning framework, in which the distribution of latent features across different episodes is dynamically adjusted based on a self-segmentation scheme, augmenting category-specific descriptors for label prediction. Specifically, a novel self-supervised inner-loop is firstly devised as the base learner to extract the underlying semantic features from the support image. Then, gradient maps are calculated by back-propagating self-supervised loss through the obtained features, and leveraged as guidance for augmenting the corresponding elements in embedding space. Finally, with the ability to continuously learn from different episodes, an optimization-based meta-learner is adopted as outer loop of our proposed framework to gradually refine the segmentation results. Extensive experiments on benchmark PASCAL-$5^{i}$ and COCO-$20^{i}$ datasets demonstrate the superiority of our proposed method over state-of-the-art.
Few-shot semantic segmentation is a challenging task of predicting object categories in pixel-wise with only few annotated samples. However, existing approaches still face two main challenges. First, huge feature distinction between support and query
Few-shot segmentation has been attracting a lot of attention due to its effectiveness to segment unseen object classes with a few annotated samples. Most existing approaches use masked Global Average Pooling (GAP) to encode an annotated support image
Reducing the amount of supervision required by neural networks is especially important in the context of semantic segmentation, where collecting dense pixel-level annotations is particularly expensive. In this paper, we address this problem from a ne
Few-shot semantic segmentation aims at learning to segment a target object from a query image using only a few annotated support images of the target class. This challenging task requires to understand diverse levels of visual cues and analyze fine-g
Unsupervised Domain Adaptation (UDA) transfers predictive models from a fully-labeled source domain to an unlabeled target domain. In some applications, however, it is expensive even to collect labels in the source domain, making most previous works