No Arabic abstract
This paper presents and discusses algorithms, hardware, and software architecture developed by the TEAM CoSTAR (Collaborative SubTerranean Autonomous Robots), competing in the DARPA Subterranean Challenge. Specifically, it presents the techniques utilized within the Tunnel (2019) and Urban (2020) competitions, where CoSTAR achieved 2nd and 1st place, respectively. We also discuss CoSTARs demonstrations in Martian-analog surface and subsurface (lava tubes) exploration. The paper introduces our autonomy solution, referred to as NeBula (Networked Belief-aware Perceptual Autonomy). NeBula is an uncertainty-aware framework that aims at enabling resilient and modular autonomy solutions by performing reasoning and decision making in the belief space (space of probability distributions over the robot and world states). We discuss various components of the NeBula framework, including: (i) geometric and semantic environment mapping; (ii) a multi-modal positioning system; (iii) traversability analysis and local planning; (iv) global motion planning and exploration behavior; (i) risk-aware mission planning; (vi) networking and decentralized reasoning; and (vii) learning-enabled adaptation. We discuss the performance of NeBula on several robot types (e.g. wheeled, legged, flying), in various environments. We discuss the specific results and lessons learned from fielding this solution in the challenging courses of the DARPA Subterranean Challenge competition.
One of the main challenges in autonomous robotic exploration and navigation in unknown and unstructured environments is determining where the robot can or cannot safely move. A significant source of difficulty in this determination arises from stochasticity and uncertainty, coming from localization error, sensor sparsity and noise, difficult-to-model robot-ground interactions, and disturbances to the motion of the vehicle. Classical approaches to this problem rely on geometric analysis of the surrounding terrain, which can be prone to modeling errors and can be computationally expensive. Moreover, modeling the distribution of uncertain traversability costs is a difficult task, compounded by the various error sources mentioned above. In this work, we take a principled learning approach to this problem. We introduce a neural network architecture for robustly learning the distribution of traversability costs. Because we are motivated by preserving the life of the robot, we tackle this learning problem from the perspective of learning tail-risks, i.e. the Conditional Value-at-Risk (CVaR). We show that this approach reliably learns the expected tail risk given a desired probability risk threshold between 0 and 1, producing a traversability costmap which is more robust to outliers, more accurately captures tail risks, and is more computationally efficient, when compared against baselines. We validate our method on data collected a legged robot navigating challenging, unstructured environments including an abandoned subway, limestone caves, and lava tube caves.
This invited paper discusses a new but important problem, supervised autonomy, in the context of robotics. The paper defines supervised autonomy and compares the supervised autonomy with robotic teleoperation and robotic full autonomy. Based on the discussion, the significance of supervised autonomy was introduced. The paper discusses the challenging and unsolved problems in supervised autonomy, and reviews the related works in our research lab. Based on the discussions, the paper draws the conclusion that supervised autonomy is critical for applying robotic systems to address complicated problems in the real world.
This work establishes COMPRA, a compact and reactive autonomy framework for fast deployment of MAVs in subterranean Search-and-Rescue missions. A COMPRA-enabled MAV is able to autonomously explore previously unknown areas while specific mission criteria are considered e.g. an object of interest is identified and localized, the remaining useful battery life, the overall desired exploration mission duration. The proposed architecture follows a low-complexity algorithmic design to facilitate fully on-board computations, including nonlinear control, state-estimation, navigation, exploration behavior and object localization capabilities. The framework is mainly structured around a reactive local avoidance planner, based on enhanced Potential Field concepts and using instantaneous 3D pointclouds, as well as a computationally efficient heading regulation technique, based on contour detection on an instantaneous camera stream. Those techniques decouple the collision-free path generation from the dependency of a global map and are capable of handling imprecise localization occasions. Field experimental verification of the overall architecture is performed in relevant unknown GPS-denied environments.
This work tackles scene understanding for outdoor robotic navigation, solely relying on images captured by an on-board camera. Conventional visual scene understanding interprets the environment based on specific descriptive categories. However, such a representation is not directly interpretable for decision-making and constrains robot operation to a specific domain. Thus, we propose to segment egocentric images directly in terms of how a robot can navigate in them, and tailor the learning problem to an autonomous navigation task. Building around an image segmentation network, we present a generic and scalable affordance-based definition consisting of 3 driveability levels which can be applied to arbitrary scenes. By encoding these levels with soft ordinal labels, we incorporate inter-class distances during learning which improves segmentation compared to standard one-hot labelling. In addition, we propose a navigation-oriented pixel-wise loss weighting method which assigns higher importance to safety-critical areas. We evaluate our approach on large-scale public image segmentation datasets spanning off-road and urban scenes. In a zero-shot cross-dataset generalization experiment, we show that our affordance learning scheme can be applied across a diverse mix of datasets and improves driveability estimation in unseen environments compared to general-purpose, single-dataset segmentation.
We propose Super Odometry, a high-precision multi-modal sensor fusion framework, providing a simple but effective way to fuse multiple sensors such as LiDAR, camera, and IMU sensors and achieve robust state estimation in perceptually-degraded environments. Different from traditional sensor-fusion methods, Super Odometry employs an IMU-centric data processing pipeline, which combines the advantages of loosely coupled methods with tightly coupled methods and recovers motion in a coarse-to-fine manner. The proposed framework is composed of three parts: IMU odometry, visual-inertial odometry, and laser-inertial odometry. The visual-inertial odometry and laser-inertial odometry provide the pose prior to constrain the IMU bias and receive the motion prediction from IMU odometry. To ensure high performance in real-time, we apply a dynamic octree that only consumes 10 % of the running time compared with a static KD-tree. The proposed system was deployed on drones and ground robots, as part of Team Explorers effort to the DARPA Subterranean Challenge where the team won $1^{st}$ and $2^{nd}$ place in the Tunnel and Urban Circuits, respectively.