
Active Boundary Loss for Semantic Segmentation

Published by Chi Wang
Publication date: 2021
Research field: Informatics Engineering
Research language: English





This paper proposes a novel active boundary loss for semantic segmentation. It can progressively encourage the alignment between predicted boundaries and ground-truth boundaries during end-to-end training, which is not explicitly enforced in commonly used cross-entropy loss. Based on the predicted boundaries detected from the segmentation results using current network parameters, we formulate the boundary alignment problem as a differentiable direction vector prediction problem to guide the movement of predicted boundaries in each iteration. Our loss is model-agnostic and can be plugged into the training of segmentation networks to improve the boundary details. Experimental results show that training with the active boundary loss can effectively improve the boundary F-score and mean Intersection-over-Union on challenging image and video object segmentation datasets.
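Two quantities named in the abstract, boundaries detected from a segmentation result and mean Intersection-over-Union, can be illustrated with a small plain-Python sketch. This is not the authors' implementation: the function names `boundary_pixels` and `mean_iou` are hypothetical, and the rule used here (a pixel is a boundary pixel if its right or lower neighbour has a different label) is an assumption made only for illustration.

```python
def boundary_pixels(labels):
    """Return the set of (row, col) pixels whose right or lower
    neighbour carries a different class label (assumed boundary rule)."""
    h, w = len(labels), len(labels[0])
    boundary = set()
    for r in range(h):
        for c in range(w):
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < h and cc < w and labels[rr][cc] != labels[r][c]:
                    boundary.add((r, c))
    return boundary


def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union, averaged over the classes that
    appear in either the prediction or the ground truth."""
    ious = []
    for k in range(num_classes):
        inter = 0
        union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                inter += int(p == k and t == k)
                union += int(p == k or t == k)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Toy 2x4 label maps: the prediction mislabels one pixel near the boundary.
target = [[0, 0, 1, 1],
          [0, 0, 1, 1]]
pred = [[0, 1, 1, 1],
        [0, 0, 1, 1]]

print(sorted(boundary_pixels(target)))
print(sorted(boundary_pixels(pred)))
print(mean_iou(pred, target, num_classes=2))
```

In this toy case the single mislabelled pixel changes the predicted boundary set, which is the kind of misalignment a boundary-aware loss is meant to penalize more directly than a per-pixel loss does.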


Read also

In this paper, we propose a Boundary-aware Graph Reasoning (BGR) module to learn long-range contextual features for semantic segmentation. Rather than directly constructing the graph based on the backbone features, our BGR module explores a reasonable way to combine segmentation erroneous regions with the graph construction scenario. Motivated by the fact that most hard-to-segment pixels broadly distribute on boundary regions, our BGR module uses the boundary score map as prior knowledge to intensify the graph node connections and thereby guide the graph reasoning to focus on boundary regions. In addition, we employ an efficient graph convolution implementation to reduce the computational cost, which benefits the integration of our BGR module into current segmentation backbones. Extensive experiments on three challenging segmentation benchmarks demonstrate the effectiveness of our proposed BGR module for semantic segmentation.
Shuai Zhao, Boxi Wu, Wenqing Chu (2019)
Most semantic segmentation models treat semantic segmentation as a pixel-wise classification task and use a pixel-wise classification error as their optimization criterion. However, the pixel-wise error ignores the strong dependencies among the pixels in an image, which limits the performance of the model. Several ways to incorporate the structure information of the objects have been investigated, e.g., conditional random fields (CRF), image structure priors based methods, and generative adversarial networks (GAN). Nevertheless, these methods usually require extra model branches or additional memory, and some of them show limited improvements. In contrast, we propose a simple yet effective structural similarity loss (SSL) to encode the structure information of the objects, which only requires a few additional computational resources in the training phase. Inspired by the widely-used structural similarity (SSIM) index in image quality assessment, we use the linear correlation between two images to quantify their structural similarity. The goal of the proposed SSL is to pay more attention to the positions whose associated predictions lead to a low degree of linear correlation between two corresponding regions in the ground truth map and the predicted map. Thus the model can achieve a strong structural similarity between the two maps through minimizing the SSL over the whole map. The experimental results demonstrate that our method can achieve substantial and consistent improvements in performance on the PASCAL VOC 2012 and Cityscapes datasets. The code will be released soon.
Class imbalance is a fundamental problem in computer vision applications such as semantic segmentation. Specifically, uneven class distributions in a training dataset often result in unsatisfactory performance on under-represented classes. Many works have proposed to weight the standard cross entropy loss function with pre-computed weights based on class statistics, such as the number of samples and class margins. There are two major drawbacks to these methods: 1) constantly up-weighting minority classes can introduce excessive false positives in semantic segmentation; 2) a minority class is not necessarily a hard class. The consequence is low precision due to excessive false positives. In this regard, we propose a hard-class mining loss by reshaping the vanilla cross entropy loss such that it weights the loss for each class dynamically based on instantaneous recall performance. We show that the novel recall loss changes gradually between the standard cross entropy loss and the inverse frequency weighted loss. Recall loss also leads to improved mean accuracy while offering competitive mean Intersection over Union (IoU) performance. On the Synthia dataset, recall loss achieves a 9% relative improvement in mean accuracy with competitive mean IoU using DeepLab-ResNet18 compared to the cross entropy loss. Code available at https://github.com/PotatoTian/recall-semseg.
We focus on tackling weakly supervised semantic segmentation with scribble-level annotation. The regularized loss has been proven to be an effective solution for this task. However, most existing regularized losses only leverage static shallow features (color, spatial information) to compute the regularized kernel, which limits their final performance since such static shallow features fail to describe pair-wise pixel relationships in complicated cases. In this paper, we propose a new regularized loss which utilizes both shallow and deep features that are dynamically updated in order to aggregate sufficient information to represent the relationship of different pixels. Moreover, in order to provide accurate deep features, we adopt a vision transformer as the backbone and design a feature consistency head to train the pair-wise feature relationship. Unlike most approaches that adopt a multi-stage training strategy with many bells and whistles, our approach can be directly trained in an end-to-end manner, in which the feature consistency head and our regularized loss can benefit from each other. Extensive experiments show that our approach achieves new state-of-the-art performance, outperforming other approaches by a significant margin with more than 6% mIoU increase.
Hao Li, Chenxin Tao, Xizhou Zhu (2020)
Designing proper loss functions is essential in training deep networks. Especially in the field of semantic segmentation, various evaluation metrics have been proposed for diverse scenarios. Despite the success of the widely adopted cross-entropy loss and its variants, the misalignment between the loss functions and evaluation metrics degrades the network performance. Meanwhile, manually designing loss functions for each specific metric requires expertise and significant manpower. In this paper, we propose to automate the design of metric-specific loss functions by searching differentiable surrogate losses for each metric. We substitute the non-differentiable operations in the metrics with parameterized functions, and conduct parameter search to optimize the shape of loss surfaces. Two constraints are introduced to regularize the search space and make the search efficient. Extensive experiments on PASCAL VOC and Cityscapes demonstrate that the searched surrogate losses outperform the manually designed loss functions consistently. The searched losses can generalize well to other datasets and networks. Code shall be released.
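Several of the abstracts above reshape the cross-entropy loss. As one illustration, the recall-based weighting described in the class-imbalance abstract can be sketched in plain Python. This is a hedged sketch, not the released code: the helper names `class_recalls` and `recall_weighted_ce` are hypothetical, and the specific weight `1 - recall` per class is an assumption chosen to match the description "weights the loss for each class dynamically based on instantaneous recall performance".

```python
import math


def class_recalls(pred_labels, true_labels, num_classes):
    """Per-class recall computed on the current batch of pixels."""
    true_pos = [0] * num_classes
    total = [0] * num_classes
    for p, t in zip(pred_labels, true_labels):
        total[t] += 1
        true_pos[t] += int(p == t)
    # Classes absent from the batch get recall 1.0 (they contribute no loss).
    return [true_pos[k] / total[k] if total[k] else 1.0
            for k in range(num_classes)]


def recall_weighted_ce(probs, true_labels, num_classes):
    """Cross entropy where each pixel is weighted by (1 - recall) of its
    ground-truth class; the weighting rule is an assumption of this sketch."""
    pred_labels = [max(range(num_classes), key=lambda k: row[k])
                   for row in probs]
    recalls = class_recalls(pred_labels, true_labels, num_classes)
    losses = [-(1.0 - recalls[t]) * math.log(row[t])
              for row, t in zip(probs, true_labels)]
    return sum(losses) / len(losses)


# Four pixels, two classes: class 1 is recalled only half of the time.
probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]
true_labels = [0, 0, 1, 1]
print(recall_weighted_ce(probs, true_labels, num_classes=2))
```

With this weighting, a class that is already fully recalled contributes nothing to the loss, while a poorly recalled class is emphasized regardless of how frequent it is.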