Visual SLAM with Graph-Cut Optimized Multi-Plane Reconstruction


Abstract in English

This paper presents a semantic planar SLAM system that improves pose estimation and mapping using cues from an instance planar segmentation network. While the mainstream approaches are using RGB-D sensors, employing a monocular camera with such a system still faces challenges such as robust data association and precise geometric model fitting. In the majority of existing work, geometric model estimation problems such as homography estimation and piece-wise planar reconstruction (PPR) are usually solved by standard (greedy) RANSAC separately and sequentially. However, setting the inlier-outlier threshold is difficult in absence of information about the scene (i.e. the scale). In this work, we revisit these problems and argue that two mentioned geometric models (homographies/3D planes) can be solved by minimizing an energy function that exploits the spatial coherence, i.e. with graph-cut optimization, which also tackles the practical issue when the output of a trained CNN is inaccurate. Moreover, we propose an adaptive parameter setting strategy based on our experiments, and report a comprehensive evaluation on various open-source datasets.

Download