ترغب بنشر مسار تعليمي؟ اضغط هنا

WaveSNet: Wavelet Integrated Deep Networks for Image Segmentation

116   0   0.0 ( 0 )
 نشر من قبل Qiufu Li
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In deep networks, the lost data details significantly degrade the performances of image segmentation. In this paper, we propose to apply Discrete Wavelet Transform (DWT) to extract the data details during feature map down-sampling, and adopt Inverse DWT (IDWT) with the extracted details during the up-sampling to recover the details. We firstly transform DWT/IDWT as general network layers, which are applicable to 1D/2D/3D data and various wavelets like Haar, Cohen, and Daubechies, etc. Then, we design wavelet integrated deep networks for image segmentation (WaveSNets) based on various architectures, including U-Net, SegNet, and DeepLabv3+. Due to the effectiveness of the DWT/IDWT in processing data details, experimental results on CamVid, Pascal VOC, and Cityscapes show that our WaveSNets achieve better segmentation performances than their vanil

قيم البحث

اقرأ أيضاً

103 - Qiufu Li , Linlin Shen , Sheng Guo 2020
Convolutional Neural Networks (CNNs) are generally prone to noise interruptions, i.e., small image noise can cause drastic changes in the output. To suppress the noise effect to the final predication, we enhance CNNs by replacing max-pooling, strided -convolution, and average-pooling with Discrete Wavelet Transform (DWT). We present general DWT and Inverse DWT (IDWT) layers applicable to various wavelets like Haar, Daubechies, and Cohen, etc., and design wavelet integrated CNNs (WaveCNets) using these layers for image classification. In WaveCNets, feature maps are decomposed into the low-frequency and high-frequency components during the down-sampling. The low-frequency component stores main information including the basic object structures, which is transmitted into the subsequent layers to extract robust high-level features. The high-frequency components, containing most of the data noise, are dropped during inference to improve the noise-robustness of the WaveCNets. Our experimental results on ImageNet and ImageNet-C (the noisy version of ImageNet) show that WaveCNets, the wavelet integrat
Single encoder-decoder methodologies for semantic segmentation are reaching their peak in terms of segmentation quality and efficiency per number of layers. To address these limitations, we propose a new architecture based on a decoder which uses a s et of shallow networks for capturing more information content. The new decoder has a new topology of skip connections, namely backward and stacked residual connections. In order to further improve the architecture we introduce a weight function which aims to re-balance classes to increase the attention of the networks to under-represented objects. We carried out an extensive set of experiments that yielded state-of-the-art results for the CamVid, Gatech and Freiburg Forest datasets. Moreover, to further prove the effectiveness of our decoder, we conducted a set of experiments studying the impact of our decoder to state-of-the-art segmentation techniques. Additionally, we present a set of experiments augmenting semantic segmentation with optical flow information, showing that motion clues can boost pure image based semantic segmentation approaches.
Though widely used in image classification, convolutional neural networks (CNNs) are prone to noise interruptions, i.e. the CNN output can be drastically changed by small image noise. To improve the noise robustness, we try to integrate CNNs with wav elet by replacing the common down-sampling (max-pooling, strided-convolution, and average pooling) with discrete wavelet transform (DWT). We firstly propose general DWT and inverse DWT (IDWT) layers applicable to various orthogonal and biorthogonal discrete wavelets like Haar, Daubechies, and Cohen, etc., and then design wavelet integrated CNNs (WaveCNets) by integrating DWT into the commonly used CNNs (VGG, ResNets, and DenseNet). During the down-sampling, WaveCNets apply DWT to decompose the feature maps into the low-frequency and high-frequency components. Containing the main information including the basic object structures, the low-frequency component is transmitted into the following layers to generate robust high-level features. The high-frequency components are dropped to remove most of the data noises. The experimental results show that %wavelet accelerates the CNN training, and WaveCNets achieve higher accuracy on ImageNet than various vanilla CNNs. We have also tested the performance of WaveCNets on the noisy version of ImageNet, ImageNet-C and six adversarial attacks, the results suggest that the proposed DWT/IDWT layers could provide better noise-robustness and adversarial robustness. When applying WaveCNets as backbones, the performance of object detectors (i.e., faster R-CNN and RetinaNet) on COCO detection dataset are consistently improved. We believe that suppression of aliasing effect, i.e. separation of low frequency and high frequency information, is the main advantages of our approach. The code of our DWT/IDWT layer and different WaveCNets are available at https://github.com/CVI-SZU/WaveCNet.
In this paper, we present a novel approach to perform deep neural networks layer-wise weight initialization using Linear Discriminant Analysis (LDA). Typically, the weights of a deep neural network are initialized with: random values, greedy layer-wi se pre-training (usually as Deep Belief Network or as auto-encoder) or by re-using the layers from another network (transfer learning). Hence, many training epochs are needed before meaningful weights are learned, or a rather similar dataset is required for seeding a fine-tuning of transfer learning. In this paper, we describe how to turn an LDA into either a neural layer or a classification layer. We analyze the initialization technique on historical documents. First, we show that an LDA-based initialization is quick and leads to a very stable initialization. Furthermore, for the task of layout analysis at pixel level, we investigate the effectiveness of LDA-based initialization and show that it outperforms state-of-the-art random weight initialization methods.
Image segmentation, one of the most critical vision tasks, has been studied for many years. Most of the early algorithms are unsupervised methods, which use hand-crafted features to divide the image into many regions. Recently, owing to the great suc cess of deep learning technology, CNNs based methods show superior performance in image segmentation. However, these methods rely on a large number of human annotations, which are expensive to collect. In this paper, we propose a deep unsupervised method for image segmentation, which contains the following two stages. First, a Superpixelwise Autoencoder (SuperAE) is designed to learn the deep embedding and reconstruct a smoothed image, then the smoothed image is passed to generate superpixels. Second, we present a novel clustering algorithm called Deep Superpixel Cut (DSC), which measures the deep similarity between superpixels and formulates image segmentation as a soft partitioning problem. Via backpropagation, DSC adaptively partitions the superpixels into perceptual regions. Experimental results on the BSDS500 dataset demonstrate the effectiveness of the proposed method.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا