ﻻ يوجد ملخص باللغة العربية
Depth information matters in RGB-D semantic segmentation task for providing additional geometric information to color images. Most existing methods exploit a multi-stage fusion strategy to propagate depth feature to the RGB branch. However, at the very deep stage, the propagation in a simple element-wise addition manner can not fully utilize the depth information. We propose Global-Local propagation network (GLPNet) to solve this problem. Specifically, a local context fusion module(L-CFM) is introduced to dynamically align both modalities before element-wise fusion, and a global context fusion module(G-CFM) is introduced to propagate the depth information to the RGB branch by jointly modeling the multi-modal global context features. Extensive experiments demonstrate the effectiveness and complementarity of the proposed fusion modules. Embedding two fusion modules into a two-stream encoder-decoder structure, our GLPNet achieves new state-of-the-art performance on two challenging indoor scene segmentation datasets, i.e., NYU-Depth v2 and SUN-RGBD dataset.
Depth information has proven to be a useful cue in the semantic segmentation of RGB-D images for providing a geometric counterpart to the RGB representation. Most existing works simply assume that depth measurements are accurate and well-aligned with
Scene depth information can help visual information for more accurate semantic segmentation. However, how to effectively integrate multi-modality information into representative features is still an open problem. Most of the existing work uses DCNNs
RGB-D semantic segmentation has attracted increasing attention over the past few years. Existing methods mostly employ homogeneous convolution operators to consume the RGB and depth features, ignoring their intrinsic differences. In fact, the RGB val
We introduce 3D-SIS, a novel neural network architecture for 3D semantic instance segmentation in commodity RGB-D scans. The core idea of our method is to jointly learn from both geometric and color signal, thus enabling accurate instance predictions
Interpretation of Airborne Laser Scanning (ALS) point clouds is a critical procedure for producing various geo-information products like 3D city models, digital terrain models and land use maps. In this paper, we present a local and global encoder ne