Semantic Segmentation with Labeling Uncertainty and Class Imbalance

Published by Lucas Prado Osco

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Patrik Olã Bressan, Jose Marcato Junior, Jose Augusto Correan Martins

Computer Vision and Pattern Recognition

Abstract

Recently, methods based on Convolutional Neural Networks (CNNs) have achieved impressive success in semantic segmentation tasks. However, challenges such as class imbalance and uncertainty in the pixel-labeling process are not completely addressed. We therefore present a new approach that calculates a weight for each pixel from its class and the uncertainty in its labeling. The pixel-wise weights are used during training to increase or decrease the importance of individual pixels. Experimental results show that the proposed approach leads to significant improvements over baseline methods in three challenging segmentation tasks. It also proved less sensitive to noise. The approach may be used within a wide range of semantic segmentation methods to improve their robustness.
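
The abstract does not give the weighting formula, so the following is only a minimal sketch of how pixel-wise weights could enter a training loss. It assumes inverse-frequency class weights, a per-pixel labeling certainty in [0, 1], and a cross-entropy loss; the function names `pixel_weights` and `weighted_cross_entropy` are hypothetical and not taken from the paper.

```python
import numpy as np

def pixel_weights(labels, certainty, num_classes):
    """Hypothetical weighting: inverse class frequency times labeling certainty.

    labels: (H, W) integer class map; certainty: (H, W) values in [0, 1].
    """
    counts = np.bincount(labels.ravel(), minlength=num_classes).astype(float)
    freq = counts / counts.sum()
    class_w = np.zeros(num_classes)
    present = counts > 0
    # A perfectly balanced label map gets weight 1.0 for every class.
    class_w[present] = 1.0 / (num_classes * freq[present])
    return class_w[labels] * certainty

def weighted_cross_entropy(probs, labels, weights, eps=1e-12):
    """Weighted mean of per-pixel cross-entropy.

    probs: (H, W, C) softmax outputs; labels: (H, W) ints; weights: (H, W).
    """
    h, w = labels.shape
    # Pick the predicted probability of the annotated class at each pixel.
    p_true = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    losses = -np.log(p_true + eps)
    return float((weights * losses).sum() / (weights.sum() + eps))
```

Under these assumptions, pixels of rare classes and pixels with confident labels contribute more to the loss, while pixels with uncertain labels are down-weighted.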

Yanchao Yang, Hanxiang Ren, He Wang (2021)

We describe an unsupervised domain adaptation method for image content shift caused by viewpoint changes for a semantic segmentation task. Most existing methods perform domain alignment in a shared space and assume that the mapping from the aligned space to the output is transferable. However, the novel content induced by viewpoint changes may nullify such a space for effective alignments, thus resulting in negative adaptation. Our method works without aligning any statistics of the images between the two domains. Instead, it utilizes a view transformation network trained only on color images to hallucinate the semantic images for the target. Despite the lack of supervision, the view transformation network can still generalize to semantic images thanks to the inductive bias introduced by the attention mechanism. Furthermore, to resolve ambiguities in converting the semantic images to semantic labels, we treat the view transformation network as a functional representation of an unknown mapping implied by the color images and propose functional label hallucination to generate pseudo-labels in the target domain. Our method surpasses baselines built on state-of-the-art correspondence estimation and view synthesis methods. Moreover, it outperforms the state-of-the-art unsupervised domain adaptation methods that utilize self-training and adversarial domain alignment. Our code and dataset will be made publicly available.

Computer Vision and Pattern Recognition

LabOR: Labeling Only if Required for Domain Adaptive Semantic Segmentation

Inkyu Shin, Dong-jin Kim, Jae Won Cho (2021)

Unsupervised Domain Adaptation (UDA) for semantic segmentation has been actively studied to mitigate the domain gap between label-rich source data and unlabeled target data. Despite these efforts, UDA still has a long way to go to reach fully supervised performance. To this end, we propose a Labeling Only if Required strategy, LabOR, where we introduce a human-in-the-loop approach to adaptively give scarce labels to points that a UDA model is uncertain about. To find the uncertain points, we generate an inconsistency mask using the proposed adaptive pixel selector, and we label these segment-based regions to achieve near-supervised performance with only a small fraction (about 2.2%) of ground-truth points, which we call Segment-based Pixel-Labeling (SPL). To further reduce the effort of the human annotator, we also propose Point-based Pixel-Labeling (PPL), which finds the most representative points for labeling within the generated inconsistency mask. This reduces the effort from 2.2% segment labels to 40 point labels while minimizing performance degradation. Through extensive experimentation, we show the advantages of this new framework for domain adaptive semantic segmentation while minimizing human labor costs.
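
The abstract does not describe how the adaptive pixel selector builds the inconsistency mask. As a rough illustration only, the sketch below assumes the mask marks pixels where two predictors disagree on the most likely class; `inconsistency_mask` and `labeled_fraction` are hypothetical names.

```python
import numpy as np

def inconsistency_mask(probs_a, probs_b):
    """Boolean (H, W) mask of pixels where two predictors disagree.

    probs_a, probs_b: (H, W, C) class probabilities from two predictors
    (an assumption made for this sketch, not a detail from the abstract).
    """
    return probs_a.argmax(axis=-1) != probs_b.argmax(axis=-1)

def labeled_fraction(mask):
    """Fraction of pixels that would be sent to the human annotator."""
    return float(mask.mean())
```

Under this assumption, only the pixels flagged by the mask would be sent to the annotator, and `labeled_fraction` reports the resulting annotation budget.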

Computer Vision and Pattern Recognition

RSS-Net: Weakly-Supervised Multi-Class Semantic Segmentation with FMCW Radar

Prannay Kaul, Daniele De Martini, Matthew Gadd (2020)

This paper presents an efficient annotation procedure and an application thereof to end-to-end, rich semantic segmentation of the sensed environment using FMCW scanning radar. We advocate radar over the traditional sensors used for this task as it operates at longer ranges and is substantially more robust to adverse weather and illumination conditions. We avoid laborious manual labelling by exploiting the largest radar-focused urban autonomy dataset collected to date, correlating radar scans with RGB cameras and LiDAR sensors, for which semantic segmentation is an already consolidated procedure. The training procedure leverages a publicly available, state-of-the-art natural image segmentation system and, in contrast to previous approaches, allows for the production of copious labels for the radar stream by incorporating four camera and two LiDAR streams. Additionally, the losses are computed taking into account labels to the radar sensor horizon by accumulating LiDAR returns along a pose-chain ahead of and behind the current vehicle position. Finally, we present the network with multi-channel radar scan inputs in order to deal with ephemeral and dynamic scene objects.

Computer Vision and Pattern Recognition, Robotics

Uncertainty-Aware Consistency Regularization for Cross-Domain Semantic Segmentation

Qianyu Zhou, Zhengyang Feng, Qiqi Gu (2020)

Unsupervised domain adaptation (UDA) aims to adapt existing models of the source domain to a new target domain with only unlabeled data. Most existing methods suffer from noticeable negative transfer resulting from either the error-prone discriminator network or the unreasonable teacher model. Besides, the local regional consistency in UDA has been largely neglected, and extracting only the global-level pattern information is not powerful enough for feature alignment due to the misuse of contexts. To this end, we propose an uncertainty-aware consistency regularization method for cross-domain semantic segmentation. Firstly, we introduce an uncertainty-guided consistency loss with a dynamic weighting scheme by exploiting the latent uncertainty information of the target samples. As such, more meaningful and reliable knowledge from the teacher model can be transferred to the student model. We further reveal the reason why the current consistency regularization is often unstable in minimizing the domain discrepancy. Besides, we design a ClassDrop mask generation algorithm to produce strong class-wise perturbations. Guided by this mask, we propose a ClassOut strategy to realize effective regional consistency in a fine-grained manner. Experiments demonstrate that our method outperforms the state-of-the-art methods on four domain adaptation benchmarks, i.e., GTAV $\rightarrow$ Cityscapes, SYNTHIA $\rightarrow$ Cityscapes, Virtual KITTI $\rightarrow$ KITTI, and Cityscapes $\rightarrow$ KITTI.
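
The exact form of the uncertainty-guided consistency loss is not given in the abstract. The sketch below shows one plausible reading, assuming teacher uncertainty is measured by normalized prediction entropy and the consistency term is a squared difference between student and teacher probabilities; both choices and all function names are assumptions made for illustration.

```python
import numpy as np

def normalized_entropy(probs, eps=1e-12):
    """Per-pixel entropy of (H, W, C) probabilities, scaled to roughly [0, 1]."""
    num_classes = probs.shape[-1]
    entropy = -(probs * np.log(probs + eps)).sum(axis=-1)
    return entropy / np.log(num_classes)

def uncertainty_weighted_consistency(student, teacher):
    """Mean squared student-teacher gap, down-weighted where the teacher is unsure.

    student, teacher: (H, W, C) class probabilities on the same target image.
    """
    # Confident teacher pixels get weight near 1, uncertain ones near 0.
    weights = np.clip(1.0 - normalized_entropy(teacher), 0.0, 1.0)
    per_pixel = ((student - teacher) ** 2).sum(axis=-1)
    return float((weights * per_pixel).mean())
```

With this weighting, pixels where the teacher prediction is close to uniform contribute almost nothing, so unreliable teacher outputs are not forced onto the student.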

Computer Vision and Pattern Recognition

Learning Meta-class Memory for Few-Shot Semantic Segmentation

Zhonghua Wu, Xiangxi Shi, Guosheng Lin (2021)

Currently, the state-of-the-art methods treat the few-shot semantic segmentation task as a conditional foreground-background segmentation problem, assuming each class is independent. In this paper, we introduce the concept of meta-class, which is the meta information (e.g. certain middle-level features) shareable among all classes. To explicitly learn meta-class representations in the few-shot segmentation task, we propose a novel Meta-class Memory based few-shot segmentation method (MM-Net), where we introduce a set of learnable memory embeddings to memorize the meta-class information during base class training and transfer it to novel classes during the inference stage. Moreover, for the $k$-shot scenario, we propose a novel image quality measurement module to select images from the set of support images. A high-quality class prototype can be obtained as the weighted sum of support image features based on the quality measure. Experiments on both the PASCAL-$5^i$ and COCO datasets show that our proposed method achieves state-of-the-art results in both 1-shot and 5-shot settings. In particular, our proposed MM-Net achieves 37.5% mIoU on the COCO dataset in the 1-shot setting, which is 5.1% higher than the previous state-of-the-art.
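
The weighted sum of support features mentioned above can be sketched as follows. The abstract does not say how quality scores are normalized, so this example assumes a softmax over the $k$ scores; `weighted_prototype` is a hypothetical helper, not the MM-Net implementation.

```python
import numpy as np

def weighted_prototype(support_features, quality_scores):
    """Class prototype as a quality-weighted sum of support features.

    support_features: (K, D) one feature vector per support image.
    quality_scores: (K,) raw quality scores (softmax-normalized here,
    which is an assumption of this sketch).
    """
    scores = np.asarray(quality_scores, dtype=float)
    weights = np.exp(scores - scores.max())  # subtract max for stability
    weights /= weights.sum()
    return weights @ np.asarray(support_features, dtype=float)
```

With equal scores this reduces to the plain average of the support features; a much higher score for one image makes the prototype follow that image.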

Computer Vision and Pattern Recognition
