A Dense Siamese U-Net trained with Edge Enhanced 3D IOU Loss for Image Co-segmentation

Added by Xi Liu

Publication date 2021

Field Informatics Engineering

Language English

Authors Xi Liu - Xiabi Liu - Huiyu Li

Computer Vision and Pattern Recognition

Abstract

Image co-segmentation has attracted a lot of attention in the computer vision community. In this paper, we propose a new approach to image co-segmentation by introducing dense connections into the decoder path of a Siamese U-Net and presenting a new edge-enhanced 3D IOU loss measured over distance maps. Considering the rigorous mapping between the signed normalized distance map (SNDM) and the binary segmentation mask, we estimate the SNDMs directly from the original images and use them to determine the segmentation results. We apply the Siamese U-Net to this problem and improve its effectiveness by densely connecting each layer with subsequent layers in the decoder path. Furthermore, a new learning loss is designed to measure the 3D intersection over union (IOU) between the generated SNDMs and the labeled SNDMs. The experimental results on commonly used datasets for image co-segmentation demonstrate the effectiveness of the proposed dense structure and the edge-enhanced 3D IOU loss over SNDMs. To the best of our knowledge, they lead to state-of-the-art performance on the Internet and iCoseg datasets.
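The abstract does not give formulas for the SNDM or the 3D IOU, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the signed distance is the Euclidean distance to the nearest pixel of the opposite class (positive inside the object), normalized separately on each side to [-1, 1], and that the "3D" IOU treats each map as a height surface and compares the volumes under the two surfaces. The function names are hypothetical, and the edge enhancement term is omitted because the abstract does not describe it.

```python
import math

def signed_normalized_distance_map(mask):
    """Assumed SNDM: positive inside the object, negative outside, in [-1, 1].

    `mask` is a list of rows of 0/1 values. Brute force, for small examples only.
    """
    h, w = len(mask), len(mask[0])
    fg = [(r, c) for r in range(h) for c in range(w) if mask[r][c]]
    bg = [(r, c) for r in range(h) for c in range(w) if not mask[r][c]]
    if not fg or not bg:
        raise ValueError("mask must contain both foreground and background")

    def nearest(p, pts):
        return min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in pts)

    # Signed distance to the nearest pixel of the opposite class.
    dist = [[nearest((r, c), bg) if mask[r][c] else -nearest((r, c), fg)
             for c in range(w)] for r in range(h)]
    max_in = max(v for row in dist for v in row if v > 0)
    max_out = max(-v for row in dist for v in row if v < 0)
    return [[v / max_in if v > 0 else v / max_out for v in row] for row in dist]

def mask_from_sndm(sndm):
    """Recover the binary mask by thresholding the map at zero."""
    return [[1 if v > 0 else 0 for v in row] for row in sndm]

def volume_iou(pred, target):
    """Assumed 3D IOU: shift both maps from [-1, 1] to heights in [0, 2],
    then divide the volume under the pointwise minimum by the volume
    under the pointwise maximum."""
    inter = union = 0.0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            inter += min(p + 1.0, t + 1.0)
            union += max(p + 1.0, t + 1.0)
    return inter / union

mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
sndm = signed_normalized_distance_map(mask)
print(mask_from_sndm(sndm) == mask)  # the mask is recoverable from the map
print(volume_iou(sndm, sndm))        # identical maps give an IOU of 1.0
```

Under these assumptions the mapping between mask and SNDM is exact in both directions, which is the property the abstract relies on when it predicts the distance map instead of the mask. A training loss would typically be one minus such an IOU, computed on differentiable tensors instead of Python lists.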

TransClaw U-Net: Claw U-Net with Transformers for Medical Image Segmentation

Yao Chang, Hu Menghan, Zhai Guangtao (2021)

In recent years, computer-aided diagnosis has become an increasingly popular topic. Methods based on convolutional neural networks have achieved good performance in medical image segmentation and classification. Due to the limitations of the convolution operation, long-range spatial features are often not captured accurately. Hence, we propose the TransClaw U-Net network structure, which combines the convolution operation with the transformer operation in the encoding part. The convolution part is applied to extract shallow spatial features and so facilitate the recovery of the image resolution after upsampling. The transformer part is used to encode the patches, and the self-attention mechanism is used to obtain global information between sequences. The decoding part retains the bottom upsampling structure for better detail segmentation performance. The experimental results on the Synapse Multi-organ Segmentation dataset show that TransClaw U-Net performs better than other network structures. The ablation experiments also demonstrate the generalization performance of TransClaw U-Net.

Computer Vision and Pattern Recognition Artificial Intelligence Machine Learning

nnU-Net: Self-adapting Framework for U-Net-Based Medical Image Segmentation

Fabian Isensee, Jens Petersen, Andre Klein (2018)

The U-Net was presented in 2015. With its straightforward and successful architecture, it quickly evolved into a commonly used benchmark in medical image segmentation. The adaptation of the U-Net to novel problems, however, comprises several degrees of freedom regarding the exact architecture, preprocessing, training and inference. These choices are not independent of each other and substantially impact the overall performance. The present paper introduces the nnU-Net (no-new-Net), a robust and self-adapting framework based on 2D and 3D vanilla U-Nets. We argue the strong case for taking away the superfluous bells and whistles of many proposed network designs and instead focusing on the remaining aspects that determine the performance and generalizability of a method. We evaluate the nnU-Net in the context of the Medical Segmentation Decathlon challenge, which measures segmentation performance in ten disciplines comprising distinct entities, image modalities, image geometries and dataset sizes, with no manual adjustments between datasets allowed. At the time of manuscript submission, nnU-Net achieves the highest mean Dice scores across all classes and seven phase 1 tasks (except class 1 in BrainTumour) in the online leaderboard of the challenge.
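For readers unfamiliar with the metric named above, here is a minimal sketch of a per-class Dice score, 2|A∩B| / (|A| + |B|), averaged over classes. It is not the challenge's evaluation code: the function names are hypothetical, label maps are flattened to plain lists, and returning 1.0 when a class is absent from both prediction and reference is an assumed convention.

```python
def dice_score(pred, target, label):
    """Dice overlap for one class label between two flattened label maps."""
    p = [v == label for v in pred]
    t = [v == label for v in target]
    intersection = sum(a and b for a, b in zip(p, t))
    denominator = sum(p) + sum(t)
    # Assumed convention: a class missing from both maps counts as a perfect match.
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator

def mean_dice(pred, target, labels):
    """Average the per-class Dice scores over the given foreground labels."""
    return sum(dice_score(pred, target, label) for label in labels) / len(labels)

pred = [0, 1, 1, 2, 2, 2]
target = [0, 1, 2, 2, 2, 2]
print(dice_score(pred, target, 1))      # 2 * 1 / (2 + 1)
print(dice_score(pred, target, 2))      # 2 * 3 / (3 + 4)
print(mean_dice(pred, target, [1, 2]))
```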

Computer Vision and Pattern Recognition

DS-TransUNet: Dual Swin Transformer U-Net for Medical Image Segmentation

Ailiang Lin, Bingzhi Chen, Jiayu Xu (2021)

Automatic medical image segmentation has made great progress, benefiting from the development of deep learning. However, most existing methods are based on convolutional neural networks (CNNs), which fail to build long-range dependencies and global context connections due to the limited receptive field of the convolution operation. Inspired by the success of the Transformer in modeling long-range contextual information, some researchers have expended considerable effort in designing robust variants of Transformer-based U-Net. Moreover, the patch division used in vision transformers usually ignores the pixel-level intrinsic structural features inside each patch. To alleviate these problems, we propose a novel deep medical image segmentation framework called Dual Swin Transformer U-Net (DS-TransUNet), which might be the first attempt to concurrently incorporate the advantages of the hierarchical Swin Transformer into both the encoder and the decoder of the standard U-shaped architecture to enhance the semantic segmentation quality of varying medical images. Unlike many prior Transformer-based solutions, the proposed DS-TransUNet first adopts dual-scale encoder subnetworks based on the Swin Transformer to extract coarse and fine-grained feature representations at different semantic scales. As the core component of DS-TransUNet, a well-designed Transformer Interactive Fusion (TIF) module is proposed to effectively establish global dependencies between features of different scales through the self-attention mechanism. Furthermore, we also introduce the Swin Transformer block into the decoder to further explore long-range contextual information during the up-sampling process. Extensive experiments across four typical tasks for medical image segmentation demonstrate the effectiveness of DS-TransUNet, and show that our approach significantly outperforms the state-of-the-art methods.

Computer Vision and Pattern Recognition

MultiResUNet: Rethinking the U-Net Architecture for Multimodal Biomedical Image Segmentation

Nabil Ibtehaz, M. Sohel Rahman (2019)

In recent years, deep learning has brought about a breakthrough in medical image segmentation. In this regard, U-Net is the most prominent deep network and has been the most popular architecture in the medical imaging community. Despite its outstanding overall performance in segmenting multimodal medical images, extensive experimentation on challenging datasets showed us that the classical U-Net architecture seems to be lacking in certain aspects. Therefore, we propose some modifications to improve upon the already state-of-the-art U-Net model. Following these modifications, we develop a novel architecture, MultiResUNet, as the potential successor to the successful U-Net architecture. We have compared our proposed MultiResUNet architecture with the classical U-Net on a vast repertoire of multimodal medical images. Although the improvements are slight for ideal images, a remarkable gain in performance has been attained for challenging images. We have evaluated our model on five different datasets, each with its own unique challenges, and have obtained relative improvements in performance of 10.15%, 5.07%, 2.63%, 1.41%, and 0.62%, respectively.

Computer Vision and Pattern Recognition

Improving 3D U-Net for Brain Tumor Segmentation by Utilizing Lesion Prior

Po-Yu Kao, Jefferson W. Chen, B.S. Manjunath (2019)

We propose a novel, simple and effective method to integrate lesion prior and a 3D U-Net for improving brain tumor segmentation. First, we utilize the ground-truth brain tumor lesions from a group of patients to generate the heatmaps of different types of lesions. These heatmaps are used to create the volume-of-interest (VOI) map which contains prior information about brain tumor lesions. The VOI map is then integrated with the multimodal MR images and input to a 3D U-Net for segmentation. The proposed method is evaluated on a public benchmark dataset, and the experimental results show that the proposed feature fusion method achieves an improvement over the baseline methods. In addition, our proposed method also achieves a competitive performance compared to state-of-the-art methods.
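The pipeline described above (ground-truth lesions, then heatmaps, then a VOI map, then fusion with the MR inputs) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' method: volumes are flattened to plain lists and assumed to be registered to a common space, the heatmap is taken to be the voxel-wise lesion frequency across patients, the VOI map is obtained with a hypothetical threshold of 0.5, and "integration" is modeled as appending the VOI map as one extra input channel. All function names are hypothetical.

```python
def lesion_heatmap(masks):
    """Voxel-wise fraction of patients whose ground-truth mask marks a lesion.

    `masks` is a list of flattened binary volumes, one per patient,
    assumed to be aligned voxel by voxel.
    """
    n_patients = len(masks)
    return [sum(voxel_values) / n_patients for voxel_values in zip(*masks)]

def voi_map(heatmap, threshold=0.5):
    """Binary volume-of-interest map; the threshold is an assumed parameter."""
    return [1 if value >= threshold else 0 for value in heatmap]

def add_prior_channel(modalities, prior):
    """Append the prior map as an extra channel next to the MR modalities."""
    return modalities + [prior]

# Three patients, four voxels each (toy data).
masks = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
]
heatmap = lesion_heatmap(masks)
voi = voi_map(heatmap)

# Two hypothetical MR modalities for a new patient, same voxel grid.
mr_channels = [[0.2, 0.8, 0.7, 0.1], [0.3, 0.9, 0.6, 0.2]]
network_input = add_prior_channel(mr_channels, voi)
print(voi)                 # voxels marked in at least half of the patients
print(len(network_input))  # number of input channels after fusion
```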

Computer Vision and Pattern Recognition Machine Learning Image and Video Processing