ترغب بنشر مسار تعليمي؟ اضغط هنا

Boundary-sensitive Network for Portrait Segmentation

146   0   0.0 ( 0 )
 نشر من قبل Xianzhi Du
 تاريخ النشر 2017
والبحث باللغة English




اسأل ChatGPT حول البحث

Compared to the general semantic segmentation problem, portrait segmentation has higher precision requirement on boundary area. However, this problem has not been well studied in previous works. In this paper, we propose a boundary-sensitive deep neural network (BSN) for portrait segmentation. BSN introduces three novel techniques. First, an individual boundary-sensitive kernel is proposed by dilating the contour line and assigning the boundary pixels with multi-class labels. Second, a global boundary-sensitive kernel is employed as a position sensitive prior to further constrain the overall shape of the segmentation map. Third, we train a boundary-sensitive attribute classifier jointly with the segmentation network to reinforce the network with semantic boundary shape information. We have evaluated BSN on the current largest public portrait segmentation dataset, i.e, the PFCN dataset, as well as the portrait images collected from other three popular image segmentation datasets: COCO, COCO-Stuff, and PASCAL VOC. Our method achieves the superior quantitative and qualitative performance over state-of-the-arts on all the datasets, especially on the boundary area.



قيم البحث

اقرأ أيضاً

122 - Yuezun Li , Ao Luo , Siwei Lyu 2019
In this paper, we describe a fast and light-weight portrait segmentation method based on a new highly light-weight backbone (HLB) architecture. The core element of HLB is a bottleneck-based factorized block (BFB) that has much fewer parameters than e xisting alternatives while keeping good learning capacity. Consequently, the HLB-based portrait segmentation method can run faster than the existing methods yet retaining the competitive accuracy performance with state-of-the-arts. Experiments conducted on two benchmark datasets demonstrate the effectiveness and efficiency of our method.
Temporal action proposal generation is an important yet challenging problem, since temporal proposals with rich action content are indispensable for analysing real-world videos with long duration and high proportion irrelevant content. This problem r equires methods not only generating proposals with precise temporal boundaries, but also retrieving proposals to cover truth action instances with high recall and high overlap using relatively fewer proposals. To address these difficulties, we introduce an effective proposal generation method, named Boundary-Sensitive Network (BSN), which adopts local to global fashion. Locally, BSN first locates temporal boundaries with high probabilities, then directly combines these boundaries as proposals. Globally, with Boundary-Sensitive Proposal feature, BSN retrieves proposals by evaluating the confidence of whether a proposal contains an action within its region. We conduct experiments on two challenging datasets: ActivityNet-1.3 and THUMOS14, where BSN outperforms other state-of-the-art temporal action proposal generation methods with high recall and high temporal precision. Finally, further experiments demonstrate that by combining existing action classifiers, our method significantly improves the state-of-the-art temporal action detection performance.
185 - Ruigang Niu , Xian Sun , Yu Tian 2020
Semantic segmentation in very high resolution (VHR) aerial images is one of the most challenging tasks in remote sensing image understanding. Most of the current approaches are based on deep convolutional neural networks (DCNNs). However, standard co nvolution with local receptive fields fails in modeling global dependencies. Prior researches have indicated that attention-based methods can capture long-range dependencies and further reconstruct the feature maps for better representation. Nevertheless, limited by the mere perspective of spacial and channel attention and huge computation complexity of self-attention mechanism, it is unlikely to model the effective semantic interdependencies between each pixel-pair of remote sensing data of complex spectra. In this work, we propose a novel attention-based framework named Hybrid Multiple Attention Network (HMANet) to adaptively capture global correlations from the perspective of space, channel and category in a more effective and efficient manner. Concretely, a class augmented attention (CAA) module embedded with a class channel attention (CCA) module can be used to compute category-based correlation and recalibrate the class-level information. Additionally, we introduce a simple yet effective region shuffle attention (RSA) module to reduce feature redundant and improve the efficiency of self-attention mechanism via region-wise representations. Extensive experimental results on the ISPRS Vaihingen and Potsdam benchmark demonstrate the effectiveness and efficiency of our HMANet over other state-of-the-art methods.
With the increase in available large clinical and experimental datasets, there has been substantial amount of work being done on addressing the challenges in the area of biomedical image analysis. Image segmentation, which is crucial for any quantita tive analysis, has especially attracted attention. Recent hardware advancement has led to the success of deep learning approaches. However, although deep learning models are being trained on large datasets, existing methods do not use the information from different learning epochs effectively. In this work, we leverage the information of each training epoch to prune the prediction maps of the subsequent epochs. We propose a novel architecture called feedback attention network (FANet) that unifies the previous epoch mask with the feature map of the current training epoch. The previous epoch mask is then used to provide a hard attention to the learnt feature maps at different convolutional layers. The network also allows to rectify the predictions in an iterative fashion during the test time. We show that our proposed feedback attention model provides a substantial improvement on most segmentation metrics tested on seven publicly available biomedical imaging datasets demonstrating the effectiveness of the proposed FANet.
In medical applications, the same anatomical structures may be observed in multiple modalities despite the different image characteristics. Currently, most deep models for multimodal segmentation rely on paired registered images. However, multimodal paired registered images are difficult to obtain in many cases. Therefore, developing a model that can segment the target objects from different modalities with unpaired images is significant for many clinical applications. In this work, we propose a novel two-stream translation and segmentation unified attentional generative adversarial network (UAGAN), which can perform any-to-any image modality translation and segment the target objects simultaneously in the case where two or more modalities are available. The translation stream is used to capture modality-invariant features of the target anatomical structures. In addition, to focus on segmentation-related features, we add attentional blocks to extract valuable features from the translation stream. Experiments on three-modality brain tumor segmentation indicate that UAGAN outperforms the existing methods in most cases.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا