Beyond Single Stage Encoder-Decoder Networks: Deep Decoders for Semantic Image Segmentation


Abstract in English

Single encoder-decoder methodologies for semantic segmentation are reaching their peak in terms of segmentation quality and efficiency per number of layers. To address these limitations, we propose a new architecture based on a decoder that uses a set of shallow networks to capture more information content. The new decoder has a new topology of skip connections, namely backward and stacked residual connections. To further improve the architecture, we introduce a weight function that re-balances classes to increase the networks' attention to under-represented objects. We carried out an extensive set of experiments that yielded state-of-the-art results on the CamVid, Gatech and Freiburg Forest datasets. Moreover, to further demonstrate the effectiveness of our decoder, we conducted a set of experiments studying its impact on state-of-the-art segmentation techniques. Additionally, we present a set of experiments augmenting semantic segmentation with optical flow information, showing that motion cues can boost purely image-based semantic segmentation approaches.
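The abstract does not specify the form of the class re-balancing weight function. As an illustration of the general idea only, the following minimal sketch uses a simplified median-frequency-style weighting, which is an assumption and not necessarily the function proposed in this work. The function name `median_frequency_weights` and the toy label map are hypothetical.

```python
from collections import Counter
from statistics import median


def median_frequency_weights(label_maps, num_classes):
    """Return one weight per class: median(freq) / freq[c].

    ASSUMPTION: simplified median-frequency-style weighting over global
    pixel frequencies; not necessarily the weight function of this work.
    label_maps: iterable of 2D lists of integer class ids.
    Classes that never appear get weight 0.0.
    """
    pixel_counts = Counter()
    for label_map in label_maps:
        for row in label_map:
            pixel_counts.update(row)

    total = sum(pixel_counts.values())
    if total == 0:
        raise ValueError("label_maps contains no pixels")

    # Fraction of all pixels that belong to each class.
    freqs = [pixel_counts.get(c, 0) / total for c in range(num_classes)]
    med = median(f for f in freqs if f > 0)

    # Rare classes (freq below the median) receive weights above 1.
    return [med / f if f > 0 else 0.0 for f in freqs]


# Toy label map: class 0 dominates, class 2 is rare.
labels = [[[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 2]]]
weights = median_frequency_weights(labels, num_classes=3)
print(weights)
```

In this toy case the dominant class receives a weight below 1 and the rarest class a weight above 1, so a per-pixel loss scaled by these weights would pay more attention to under-represented objects.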
