No Arabic abstract
Accurate localization is fundamental to a variety of applications, such as navigation, robotics, autonomous driving, and Augmented Reality (AR). Different from incremental localization, global localization has no drift caused by error accumulation, which is desired in many application scenarios. In addition to GPS used in the open air, 3D maps are also widely used as alternative global localization references. In this paper, we propose a compact 3D map-based global localization system using a low-cost monocular camera and an IMU (Inertial Measurement Unit). The proposed compact map consists of two types of simplified elements with multiple semantic labels, which is well adaptive to various man-made environments like urban environments. Also, semantic edge features are used for the key image-map registration, which is robust against occlusion and long-term appearance changes in the environments. To further improve the localization performance, the key semantic edge alignment is formulated as an optimization problem based on initial poses predicted by an independent VIO (Visual-Inertial Odometry) module. The localization system is realized with modular design in real time. We evaluate the localization accuracy through real-world experimental results compared with ground truth, long-term localization performance is also demonstrated.
In this paper, we introduce an approach to tracking the pose of a monocular camera in a prior surfel map. By rendering vertex and normal maps from the prior surfel map, the global planar information for the sparse tracked points in the image frame is obtained. The tracked points with and without the global planar information involve both global and local constraints of frames to the system. Our approach formulates all constraints in the form of direct photometric errors within a local window of the frames. The final optimization utilizes these constraints to provide the accurate estimation of global 6-DoF camera poses with the absolute scale. The extensive simulation and real-world experiments demonstrate that our monocular method can provide accurate camera localization results under various conditions.
Metric localization plays a critical role in vision-based navigation. For overcoming the degradation of matching photometry under appearance changes, recent research resorted to introducing geometry constraints of the prior scene structure. In this paper, we present a metric localization method for the monocular camera, using the Signed Distance Field (SDF) as a global map representation. Leveraging the volumetric distance information from SDFs, we aim to relax the assumption of an accurate structure from the local Bundle Adjustment (BA) in previous methods. By tightly coupling the distance factor with temporal visual constraints, our system corrects the odometry drift and jointly optimizes global camera poses with the local structure. We validate the proposed approach on both indoor and outdoor public datasets. Compared to the state-of-the-art methods, it achieves a comparable performance with a minimal sensor configuration.
Reliably assessing the error in an estimated vehicle position is integral for ensuring the vehicles safety in urban environments. Many existing approaches use GNSS measurements to characterize protection levels (PLs) as probabilistic upper bounds on the position error. However, GNSS signals might be reflected or blocked in urban environments, and thus additional sensor modalities need to be considered to determine PLs. In this paper, we propose a novel approach for computing PLs by matching camera image measurements to a LiDAR-based 3D map of the environment. We specify a Gaussian mixture model probability distribution of position error using deep neural network-based data-driven models and statistical outlier weighting techniques. From the probability distribution, we compute the PLs by evaluating the position error bound using numerical line-search methods. Through experimental validation with real-world data, we demonstrate that the PLs computed from our method are reliable bounds on the position error in urban environments.
We present a novel method for visual mapping and localization for autonomous vehicles, by extracting, modeling, and optimizing semantic road elements. Specifically, our method integrates cascaded deep models to detect standardized road elements instead of traditional point features, to seek for improved pose accuracy and map representation compactness. To utilize the structural features, we model road lights and signs by their representative deep keypoints for skeleton and boundary, and parameterize lanes via piecewise cubic splines. Based on the road semantic features, we build a complete pipeline for mapping and localization, which includes a) image processing front-end, b) sensor fusion strategies, and c) optimization backend. Experiments on public datasets and our testing platform have demonstrated the effectiveness and advantages of our method by outperforming traditional approaches.
Localization is a crucial capability for mobile robots and autonomous cars. In this paper, we address learning an observation model for Monte-Carlo localization using 3D LiDAR data. We propose a novel, neural network-based observation model that computes the expected overlap of two 3D LiDAR scans. The model predicts the overlap and yaw angle offset between the current sensor reading and virtual frames generated from a pre-built map. We integrate this observation model into a Monte-Carlo localization framework and tested it on urban datasets collected with a car in different seasons. The experiments presented in this paper illustrate that our method can reliably localize a vehicle in typical urban environments. We furthermore provide comparisons to a beam-end point and a histogram-based method indicating a superior global localization performance of our method with fewer particles.