Globally consistent dense maps are a key requirement for long-term robot navigation in complex environments. While previous works have addressed the challenges of dense mapping and global consistency, most require more computational resources than may be available on board small robots. We propose a framework that creates globally consistent volumetric maps on a CPU and is lightweight enough to run on computationally constrained platforms. Our approach represents the environment as a collection of overlapping Signed Distance Function (SDF) submaps, and maintains global consistency by computing an optimal alignment of the submap collection. By exploiting the underlying SDF representation, we generate correspondence-free constraints between submap pairs that are computationally efficient enough to optimize the global problem each time a new submap is added. We deploy the proposed system on a hexacopter Micro Aerial Vehicle (MAV) with an Intel i7-8650U CPU in two realistic scenarios: mapping a large-scale area using a 3D LiDAR, and mapping an industrial space using an RGB-D camera. In the large-scale outdoor experiments, the system optimizes a 120 x 80 m map in less than 4 s and produces absolute trajectory RMSEs of less than 1 m over 400 m trajectories. Our complete system, called voxgraph, is available as open source.
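To make the correspondence-free constraint concrete, here is a minimal sketch under our own assumptions (the function and interfaces below are illustrative, not voxgraph's actual API): each residual simply compares the interpolated signed distances that two overlapping submaps report for the same physical point.

    # Hedged sketch, not the voxgraph implementation. sdf_a and sdf_b are
    # callables returning interpolated signed distances for Nx3 point arrays
    # expressed in their own submap frames; T_a_b is the current estimate of
    # submap B's pose in submap A's frame (4x4 homogeneous matrix).
    import numpy as np

    def sdf_alignment_residuals(sdf_a, sdf_b, T_a_b, points_b):
        """r_i = SDF_a(T_a_b * p_i) - SDF_b(p_i) over samples in the overlap region."""
        R, t = T_a_b[:3, :3], T_a_b[:3, 3]
        points_a = points_b @ R.T + t          # express the samples in submap A's frame
        return sdf_a(points_a) - sdf_b(points_b)

    # Toy usage: two unit-sphere SDFs that agree once B is shifted by 0.1 m in x.
    sphere = lambda centre: (lambda p: np.linalg.norm(p - np.asarray(centre), axis=1) - 1.0)
    T_a_b = np.eye(4); T_a_b[0, 3] = 0.1
    samples = np.random.default_rng(0).uniform(-2.0, 2.0, size=(100, 3))
    residuals = sdf_alignment_residuals(sphere([0.1, 0.0, 0.0]), sphere([0.0, 0.0, 0.0]), T_a_b, samples)
    assert np.allclose(residuals, 0.0)

Stacking such residuals over overlapping submap pairs yields the global alignment problem that, per the abstract, is cheap enough to re-optimize each time a new submap is added.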
Localization of a robotic system within a previously mapped environment is important for reducing estimation drift and for reusing previously built maps. Existing techniques for geometry-based localization have focused on the description of local surface geometry, usually using pointclouds as the underlying representation. We propose a system for geometry-based localization that extracts features directly from an implicit surface representation: the Signed Distance Function (SDF). The SDF varies continuously through space, which allows the proposed system to extract and utilize features describing both surfaces and free-space. Through evaluations on public datasets, we demonstrate the flexibility of this approach, and show an increase in localization performance over state-of-the-art handcrafted surface-only descriptors. We achieve an average improvement of ~12% on an RGB-D dataset and ~18% on a LiDAR-based dataset. Finally, we demonstrate our system by localizing a LiDAR-equipped MAV within a previously built map of a search and rescue training ground.
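As a rough illustration of what a feature drawn directly from the implicit representation might look like (our own sketch, not the descriptor proposed in the paper; all names and parameters are assumptions): a keypoint can be described by sampling the interpolated SDF on a fixed pattern of offsets around it, so that both nearby surfaces and the surrounding free space contribute to the description.

    # Illustrative only: a keypoint descriptor built from signed-distance samples.
    # sdf_lookup is assumed to return interpolated signed distances for Nx3 points.
    import numpy as np

    def sdf_patch_descriptor(sdf_lookup, keypoint, radius=0.5, n_samples=64):
        rng = np.random.default_rng(0)                       # fixed sampling pattern
        offsets = rng.normal(size=(n_samples, 3))
        offsets *= radius / np.linalg.norm(offsets, axis=1, keepdims=True)
        values = sdf_lookup(np.asarray(keypoint) + offsets)  # distances around the keypoint
        return (values - values.mean()) / (values.std() + 1e-9)  # normalized descriptor

Because the SDF is defined throughout space, the same recipe applies whether the keypoint lies on a surface or well inside free space, which is the flexibility the abstract refers to.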
Metric localization plays a critical role in vision-based navigation. To overcome the degradation of photometric matching under appearance changes, recent research has resorted to introducing geometric constraints from a prior scene structure. In this paper, we present a metric localization method for a monocular camera, using the Signed Distance Field (SDF) as a global map representation. Leveraging the volumetric distance information of the SDF, we aim to relax the assumption made in previous methods that the local Bundle Adjustment (BA) yields an accurate structure. By tightly coupling the distance factor with temporal visual constraints, our system corrects odometry drift and jointly optimizes global camera poses with the local structure. We validate the proposed approach on both indoor and outdoor public datasets. Compared to state-of-the-art methods, it achieves comparable performance with a minimal sensor configuration.
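One way to read the tightly coupled distance factor described above (a sketch in our own notation, not necessarily the paper's formulation): with camera poses T_i, local structure points l_j, projection function pi, visual measurements z_ij, and the global SDF Phi, the joint optimization could take the form

    \min_{\{T_i\},\,\{\mathbf{l}_j\}} \; \sum_{i,j} \big\| \mathbf{z}_{ij} - \pi\big(T_i^{-1}\mathbf{l}_j\big) \big\|^2 \;+\; w \sum_{j} \Phi(\mathbf{l}_j)^2 ,

where the second term pulls reconstructed points onto the zero level set of the prior SDF map while the first (temporal visual) term constrains the relative camera motion.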
Neural networks that map 3D coordinates to signed distance function (SDF) or occupancy values have enabled high-fidelity implicit representations of object shape. This paper develops a new shape model that allows synthesizing novel distance views by optimizing a continuous signed directional distance function (SDDF). Similar to deep SDF models, our SDDF formulation can represent whole categories of shapes and complete or interpolate across shapes from partial input data. Unlike an SDF, which measures distance to the nearest surface in any direction, an SDDF measures distance in a given direction. This allows an SDDF model to be trained without 3D shape supervision, using only distance measurements, which are readily available from depth cameras or LiDAR sensors. Our model also removes post-processing steps like surface extraction or rendering by directly predicting distance at arbitrary locations and viewing directions. Unlike deep view-synthesis techniques, such as Neural Radiance Fields, which train high-capacity black-box models, our model encodes by construction the property that SDDF values decrease linearly along the viewing direction. This structural constraint not only results in dimensionality reduction but also provides analytical confidence in the accuracy of SDDF predictions, regardless of the distance to the object surface.
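The structural property referred to above can be stated explicitly (notation ours): for a position p, a unit viewing direction v, and an SDDF f giving the distance to the surface along v,

    f(\mathbf{p} + t\,\mathbf{v}, \mathbf{v}) \;=\; f(\mathbf{p}, \mathbf{v}) - t \quad \text{for all } t \text{ along the ray},

so stepping a distance t along the viewing ray reduces the predicted value by exactly t. Because this identity determines f everywhere along each ray from a single value, the function effectively only needs to be learned on a lower-dimensional slice of its input space, which is the source of the dimensionality reduction mentioned in the abstract.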
We present an approach for multi-robot consistent distributed localization and semantic mapping in an unknown environment, considering scenarios with classification ambiguity, where objects' visual appearance generally varies with viewpoint. Our approach addresses such a setting by maintaining a distributed posterior hybrid belief over continuous localization and discrete classification variables. In particular, we utilize a viewpoint-dependent classifier model to leverage the coupling between semantics and geometry. Moreover, our approach yields a consistent estimation of both continuous and discrete variables, with the latter being addressed for the first time, to the best of our knowledge. We evaluate the performance of our approach in a multi-robot semantic SLAM simulation and in a real-world experiment, demonstrating an increase in both classification and localization accuracy compared to maintaining a hybrid belief using local information only.
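For concreteness, a hybrid belief of the kind maintained above can be written (in our notation, not necessarily the paper's) as a joint posterior over the continuous variables X_k (robot and object poses) and the discrete class variables C, given the measurement history H_k:

    b_k(X_k, C) \;=\; \mathbb{P}(X_k, C \mid \mathcal{H}_k) \;=\; \mathbb{P}(C \mid X_k, \mathcal{H}_k)\,\mathbb{P}(X_k \mid \mathcal{H}_k),

with the viewpoint-dependent classifier entering through semantic measurement likelihoods conditioned on the relative pose between camera and object, which is what couples the semantic and geometric parts of the problem.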
In many applications, maintaining a consistent map of the environment is key to enabling robotic platforms to perform higher-level decision making. Detecting already visited locations is one of the primary ways in which map consistency is maintained, especially when external positioning systems are unavailable or unreliable. Mapping in 2D is an important field in robotics, largely because man-made environments such as warehouses and homes, where robots are expected to play an increasing role, can often be approximated as planar. Place recognition in this context remains challenging: 2D lidar scans contain scant information with which to characterize, and therefore recognize, a location. This paper introduces a novel approach aimed at addressing this problem. At its core, the system relies on a distance function to represent geometry. This representation allows the extraction of features that describe the geometry of both surfaces and free-space in the environment, and we propose a feature descriptor designed for this purpose. Through evaluations on public datasets, we demonstrate the utility of free-space in the description of places, and show an increase in localization performance over a state-of-the-art descriptor extracted from surface geometry.
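As a 2D analogue of the idea (again a sketch under our own assumptions, not the feature proposed in the paper), one can rasterize a lidar scan into an occupancy grid, compute its distance field, and collect interest points both on obstacle boundaries and at local maxima of the free-space distance:

    # Illustrative 2D analogue, not the proposed feature. occupancy is a boolean
    # grid (True = obstacle) built from a 2D lidar scan; resolution is in m/cell.
    import numpy as np
    from scipy.ndimage import distance_transform_edt, maximum_filter

    def surface_and_freespace_points(occupancy, resolution, min_clearance=0.5):
        dist = distance_transform_edt(~occupancy) * resolution    # metres to nearest obstacle
        surface = np.argwhere(occupancy)                          # occupied cells
        peaks = (dist == maximum_filter(dist, size=5)) & (dist > min_clearance)
        return surface, np.argwhere(peaks)                        # free-space distance maxima

The free-space peaks summarize the shape of open regions such as room centres and corridor junctions, the kind of complementary information that surface-only descriptors leave out.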