No Arabic abstract
Person detection is a crucial task for mobile robots navigating in human-populated environments and LiDAR sensors are promising for this task, given their accurate depth measurements and large field of view. This paper studies existing LiDAR-based person detectors with a particular focus on mobile robot scenarios (e.g. service robot or social robot), where persons are observed more frequently and in much closer ranges, compared to the driving scenarios. We conduct a series of experiments, using the recently released JackRabbot dataset and the state-of-the-art detectors based on 3D or 2D LiDAR sensors (CenterPoint and DR-SPAAM respectively). These experiments revolve around the domain gap between driving and mobile robot scenarios, as well as the modality gap between 3D and 2D LiDAR sensors. For the domain gap, we aim to understand if detectors pretrained on driving datasets can achieve good performance on the mobile robot scenarios, for which there are currently no trained models readily available. For the modality gap, we compare detectors that use 3D or 2D LiDAR, from various aspects, including performance, runtime, localization accuracy, robustness to range and crowdedness. The results from our experiments provide practical insights into LiDAR-based person detection and facilitate informed decisions for relevant mobile robot designs and applications.
We present an active visual search model for finding objects in unknown environments. The proposed algorithm guides the robot towards the sought object using the relevant stimuli provided by the visual sensors. Existing search strategies are either purely reactive or use simplified sensor models that do not exploit all the visual information available. In this paper, we propose a new model that actively extracts visual information via visual attention techniques and, in conjunction with a non-myopic decision-making algorithm, leads the robot to search more relevant areas of the environment. The attention module couples both top-down and bottom-up attention models enabling the robot to search regions with higher importance first. The proposed algorithm is evaluated on a mobile robot platform in a 3D simulated environment. The results indicate that the use of visual attention significantly improves search, but the degree of improvement depends on the nature of the task and the complexity of the environment. In our experiments, we found that performance enhancements of up to 42% in structured and 38% in highly unstructured cluttered environments can be achieved using visual attention mechanisms.
We present VILENS (Visual Inertial Lidar Legged Navigation System), an odometry system for legged robots based on factor graphs. The key novelty is the tight fusion of four different sensor modalities to achieve reliable operation when the individual sensors would otherwise produce degenerate estimation. To minimize leg odometry drift, we extend the robots state with a linear velocity bias term which is estimated online. This bias is only observable because of the tight fusion of this preintegrated velocity factor with vision, lidar, and IMU factors. Extensive experimental validation on the ANYmal quadruped robots is presented, for a total duration of 2 h and 1.8 km traveled. The experiments involved dynamic locomotion over loose rocks, slopes, and mud; these included perceptual challenges, such as dark and dusty underground caverns or open, feature-deprived areas, as well as mobility challenges such as slipping and terrain deformation. We show an average improvement of 62% translational and 51% rotational errors compared to a state-of-the-art loosely coupled approach. To demonstrate its robustness, VILENS was also integrated with a perceptive controller and a local path planner.
Proportional-integral-derivative (PID) control is the most widely used in industrial control, robot control and other fields. However, traditional PID control is not competent when the system cannot be accurately modeled and the operating environment is variable in real time. To tackle these problems, we propose a self-adaptive model-free SAC-PID control approach based on reinforcement learning for automatic control of mobile robots. A new hierarchical structure is developed, which includes the upper controller based on soft actor-critic (SAC), one of the most competitive continuous control algorithms, and the lower controller based on incremental PID controller. Soft actor-critic receives the dynamic information of the mobile robot as input, and simultaneously outputs the optimal parameters of incremental PID controllers to compensate for the error between the path and the mobile robot in real time. In addition, the combination of 24-neighborhood method and polynomial fitting is developed to improve the adaptability of SAC-PID control method to complex environments. The effectiveness of the SAC-PID control method is verified with several different difficulty paths both on Gazebo and real mecanum mobile robot. Futhermore, compared with fuzzy PID control, the SAC-PID method has merits of strong robustness, generalization and real-time performance.
Modern LiDAR-SLAM (L-SLAM) systems have shown excellent results in large-scale, real-world scenarios. However, they commonly have a high latency due to the expensive data association and nonlinear optimization. This paper demonstrates that actively selecting a subset of features significantly improves both the accuracy and efficiency of an L-SLAM system. We formulate the feature selection as a combinatorial optimization problem under a cardinality constraint to preserve the information matrixs spectral attributes. The stochastic-greedy algorithm is applied to approximate the optimal results in real-time. To avoid ill-conditioned estimation, we also propose a general strategy to evaluate the environments degeneracy and modify the feature number online. The proposed feature selector is integrated into a multi-LiDAR SLAM system. We validate this enhanced system with extensive experiments covering various scenarios on two sensor setups and computation platforms. We show that our approach exhibits low localization error and speedup compared to the state-of-the-art L-SLAM systems. To benefit the community, we have released the source code: https://ram-lab.com/file/site/m-loam.
Loop closure detection is an essential component of Simultaneous Localization and Mapping (SLAM) systems, which reduces the drift accumulated over time. Over the years, several deep learning approaches have been proposed to address this task, however their performance has been subpar compared to handcrafted techniques, especially while dealing with reverse loops. In this paper, we introduce the novel LCDNet that effectively detects loop closures in LiDAR point clouds by simultaneously identifying previously visited places and estimating the 6-DoF relative transformation between the current scan and the map. LCDNet is composed of a shared encoder, a place recognition head that extracts global descriptors, and a relative pose head that estimates the transformation between two point clouds. We introduce a novel relative pose head based on the unbalanced optimal transport theory that we implement in a differentiable manner to allow for end-to-end training. Extensive evaluations of LCDNet on multiple real-world autonomous driving datasets show that our approach outperforms state-of-the-art loop closure detection and point cloud registration techniques by a large margin, especially while dealing with reverse loops. Moreover, we integrate our proposed loop closure detection approach into a LiDAR SLAM library to provide a complete mapping system and demonstrate the generalization ability using different sensor setup in an unseen city.