DiResNet: Direction-aware Residual Network for Road Extraction in VHR Remote Sensing Images

Added by Lei Ding

Publication date: 2020

Field: Informatics Engineering

Language: English

Authors: Lei Ding, Lorenzo Bruzzone

Computer Vision and Pattern Recognition

The binary segmentation of roads in very high resolution (VHR) remote sensing images (RSIs) has always been a challenging task due to factors such as occlusions (caused by shadows, trees, buildings, etc.) and the intra-class variances of road surfaces. The wide use of convolutional neural networks (CNNs) has greatly improved the segmentation accuracy and made the task end-to-end trainable. However, there is still room for improvement in the completeness and connectivity of the results. In this paper, we consider the specific context of road extraction and present a direction-aware residual network (DiResNet) that includes three main contributions: 1) an asymmetric residual segmentation network with deconvolutional layers and structural supervision to enhance the learning of road topology (DiResSeg); 2) a pixel-level supervision of local directions to enhance the embedding of linear features; 3) a refinement network to optimize the segmentation results (DiResRef). Ablation studies on two benchmark datasets (the Massachusetts dataset and the DeepGlobe dataset) have confirmed the effectiveness of the presented designs. Comparative experiments with other approaches show that the proposed method has advantages in both overall accuracy and F1-score. The code is available at: https://github.com/ggsDing/DiResNet.
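The abstract reports results in terms of overall accuracy and F1-score. As a minimal sketch of how these two metrics are commonly computed for a binary road mask (this is not taken from the authors' repository; the function name `binary_seg_metrics` and the nested-list mask format are assumptions made for illustration):

```python
from typing import Dict, Sequence


def binary_seg_metrics(
    pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]
) -> Dict[str, float]:
    """Overall accuracy and F1 for binary masks (1 = road, 0 = background).

    Hypothetical helper: masks are plain nested sequences of 0/1 values.
    """
    tp = fp = fn = tn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # road pixel correctly detected
            elif p and not t:
                fp += 1  # background predicted as road
            elif not p and t:
                fn += 1  # road pixel missed
            else:
                tn += 1  # background correctly rejected

    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    overall_accuracy = (tp + tn) / total if total else 0.0
    return {
        "overall_accuracy": overall_accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy 2x3 prediction and ground-truth masks.
example_pred = [[1, 1, 0], [0, 1, 0]]
example_target = [[1, 0, 0], [0, 1, 1]]
example_metrics = binary_seg_metrics(example_pred, example_target)
```

Because road pixels are usually a small fraction of a VHR image, overall accuracy can look high even when many road pixels are missed, which is one reason F1 is typically reported alongside it.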

DeepMask: an algorithm for cloud and cloud shadow detection in optical satellite remote sensing images using deep residual network

Ke Xu, Kaiyu Guan, Jian Peng (2019)

Detecting and masking cloud and cloud shadow in satellite remote sensing images is a pervasive problem in the remote sensing community. Accurate and efficient detection of cloud and cloud shadow is an essential step in harnessing the value of remotely sensed data for almost all downstream analysis. DeepMask, a new algorithm for cloud and cloud shadow detection in optical satellite remote sensing imagery, is proposed in this study. DeepMask utilizes ResNet, a deep convolutional neural network, for pixel-level cloud mask generation. The algorithm is trained and evaluated on the Landsat 8 Cloud Cover Assessment Validation Dataset distributed across 8 different land types. Compared with CFMask, the most widely used cloud detection algorithm, land-type-specific DeepMask models achieve higher accuracy across all land types. The average accuracy is 93.56%, compared with 85.36% from CFMask. DeepMask also achieves 91.02% accuracy on the all-land-type dataset. Compared with other CNN-based cloud mask algorithms, DeepMask benefits from the parsimonious architecture and the residual connection of ResNet. It is compatible with input of any size and shape. DeepMask still maintains high performance when using only red, green, blue, and NIR bands, indicating its potential to be applied to other satellite platforms that have only limited optical bands.

Computer Vision and Pattern Recognition, Image and Video Processing

Dense Attention Fluid Network for Salient Object Detection in Optical Remote Sensing Images

Qijian Zhang, Runmin Cong, Chongyi Li (2020)

Despite the remarkable advances in visual saliency analysis for natural scene images (NSIs), salient object detection (SOD) for optical remote sensing images (RSIs) remains an open and challenging problem. In this paper, we propose an end-to-end Dense Attention Fluid Network (DAFNet) for SOD in optical RSIs. A Global Context-aware Attention (GCA) module is proposed to adaptively capture long-range semantic context relationships, and is further embedded in a Dense Attention Fluid (DAF) structure that enables shallow attention cues to flow into deep layers to guide the generation of high-level feature attention maps. Specifically, the GCA module is composed of two key components: the global feature aggregation module achieves mutual reinforcement of salient feature embeddings from any two spatial locations, and the cascaded pyramid attention module tackles the scale variation issue by building up a cascaded pyramid framework to progressively refine the attention map in a coarse-to-fine manner. In addition, we construct a new and challenging optical RSI dataset for SOD that contains 2,000 images with pixel-wise saliency annotations, which is currently the largest publicly available benchmark. Extensive experiments demonstrate that our proposed DAFNet significantly outperforms the existing state-of-the-art SOD competitors. https://github.com/rmcong/DAFNet_TIP20

Computer Vision and Pattern Recognition

Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images

Tianyang Zhang, Xiangrong Zhang, Peng Zhu (2021)

In this paper, we focus on the challenging multi-category instance segmentation problem in remote sensing images (RSIs), which aims at predicting the categories of all instances and localizing them with pixel-level masks. Although many landmark frameworks have demonstrated promising performance in instance segmentation, the complexity of the background and the scale variability of instances remain challenging for instance segmentation of RSIs. To address these problems, we propose an end-to-end multi-category instance segmentation model, namely Semantic Attention and Scale Complementary Network, which mainly consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB). The SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of instances of interest on the feature map and reduce interference from background noise. To handle the under-segmentation of geospatial instances with widely varying scales, we design the SCMB, which extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales to fully leverage the multi-scale information. We conduct comprehensive experiments to evaluate the effectiveness of our proposed method on the iSAID dataset and the NWPU Instance Segmentation dataset and achieve promising performance.

Computer Vision and Pattern Recognition

Convolutional Recurrent Network for Road Boundary Extraction

Justin Liang, Namdar Homayounfar, Wei-Chiu Ma (2020)

Creating high-definition maps that contain precise information about the static elements of the scene is of utmost importance for enabling self-driving cars to drive safely. In this paper, we tackle the problem of drivable road boundary extraction from LiDAR and camera imagery. Towards this goal, we design a structured model in which a fully convolutional network obtains deep features encoding the location and direction of road boundaries, and a convolutional recurrent network then outputs a polyline representation for each of them. Importantly, our method is fully automatic and does not require a user in the loop. We showcase the effectiveness of our method on a large North American city, where we obtain perfect topology of road boundaries 99.3% of the time at high precision and recall.

Computer Vision and Pattern Recognition

Reciprocal Translation between SAR and Optical Remote Sensing Images with Cascaded-Residual Adversarial Networks

Shilei Fu, Feng Xu, Ya-Qiu Jin (2019)

Despite the advantages of all-weather and all-day high-resolution imaging, synthetic aperture radar (SAR) images are much less viewed and used by the general public because human vision is not adapted to the microwave scattering phenomenon. However, expert interpreters can be trained by comparing SAR and optical images side by side to learn the mapping rules from SAR to optical. This paper attempts to develop machine intelligence that is trainable with large volumes of co-registered SAR and optical images to translate SAR images into optical versions for assisted SAR image interpretation. Reciprocal SAR-optical image translation is a challenging task because it is a raw data translation between two physically very different sensing modalities. This paper proposes a novel reciprocal adversarial network scheme in which cascaded residual connections and a hybrid L1-GAN loss are employed. It is trained and tested on both spaceborne GF-3 and airborne UAVSAR images. Results are presented for datasets of different resolutions and polarizations and compared with other state-of-the-art methods. The FID is used to quantitatively evaluate the translation performance. The possibility of unsupervised learning with unpaired SAR and optical images is also explored. Results show that the proposed translation network works well under many scenarios and could potentially be used for assisted SAR interpretation.
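The abstract mentions a hybrid L1-GAN loss without giving its exact form. The sketch below shows one common way such a hybrid is built for image translation: an adversarial term for the generator plus a weighted mean-absolute-error term. Everything specific here is an assumption for illustration, including the non-saturating adversarial form, the default weight `l1_weight=100.0`, the flat-list image representation, and all function names; the paper's actual loss may differ.

```python
import math
from typing import Sequence


def l1_term(generated: Sequence[float], reference: Sequence[float]) -> float:
    """Mean absolute pixel difference between generated and reference images."""
    if len(generated) != len(reference) or not generated:
        raise ValueError("images must be non-empty and of equal length")
    return sum(abs(g - r) for g, r in zip(generated, reference)) / len(generated)


def adversarial_term(disc_probs: Sequence[float], eps: float = 1e-12) -> float:
    """Non-saturating generator loss: -mean(log D(G(x))).

    disc_probs are the discriminator's probabilities (in (0, 1]) that the
    generated samples are real. This particular form is an assumption.
    """
    if not disc_probs:
        raise ValueError("disc_probs must be non-empty")
    return -sum(math.log(max(p, eps)) for p in disc_probs) / len(disc_probs)


def hybrid_l1_gan_loss(
    generated: Sequence[float],
    reference: Sequence[float],
    disc_probs: Sequence[float],
    l1_weight: float = 100.0,  # assumed weighting, not from the paper
) -> float:
    """Generator objective: adversarial term + l1_weight * L1 term."""
    return adversarial_term(disc_probs) + l1_weight * l1_term(generated, reference)


# Toy example: a 2-pixel "image" and one discriminator score.
example_loss = hybrid_l1_gan_loss([0.0, 1.0], [1.0, 1.0], [0.5], l1_weight=10.0)
```

In hybrids of this kind, the L1 term keeps the translated image close to the co-registered reference pixel by pixel, while the adversarial term pushes the output toward the statistics of real images in the target modality.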

Computer Vision and Pattern Recognition