Decentralized swarm robotic solutions to searching for targets that emit a spatially varying signal promise task parallelism, time efficiency, and fault tolerance. It is, however, challenging for swarm algorithms to offer scalability and efficiency while preserving mathematical insights into the exhibited behavior. A new decentralized search method (called Bayes-Swarm), founded on batch Bayesian Optimization (BO) principles, is presented here to address these challenges. Unlike swarm-heuristic approaches, Bayes-Swarm decouples the knowledge-generation and task-planning processes, thus preserving insights into the emergent behavior. Key contributions lie in: 1) modeling knowledge extraction over trajectories, unlike in standard BO; 2) time-adaptively balancing exploration/exploitation and using an efficient local penalization approach to account for potential interactions among different robots' planned samples; and 3) presenting an asynchronous implementation of the algorithm. This algorithm is tested on case studies with bimodal and highly multimodal signal distributions. Up to 76 times better efficiency is demonstrated compared to an exhaustive-search baseline. The benefits of exploration/exploitation balancing, asynchronous planning, and local penalization, as well as scalability with swarm size, are also demonstrated.
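To make the second contribution concrete, the sketch below gives a minimal, hypothetical rendering of a time-adaptive acquisition function with local penalization. It is not the authors' released code; the UCB-style form, the function names (`acquisition`, `peer_plans`), and the penalization radius are all assumptions made for illustration.

```python
import numpy as np

def acquisition(mu, sigma, t, t_max, peer_plans, candidates, radius=1.0):
    """Hypothetical time-adaptive UCB acquisition with local penalization.

    mu, sigma   : GP posterior mean/std at each candidate waypoint
    t, t_max    : elapsed and total mission time (drives explore -> exploit)
    peer_plans  : waypoints already claimed by other robots
    candidates  : candidate waypoints, shape (n, 2)
    """
    beta = 2.0 * (1.0 - t / t_max)          # exploration weight decays over time
    score = mu + beta * sigma               # UCB-style explore/exploit trade-off
    for p in peer_plans:                    # local penalization: discount samples
        d = np.linalg.norm(candidates - p, axis=1)
        score *= np.clip(d / radius, 0.0, 1.0)   # near-zero score close to peers
    return candidates[np.argmax(score)]     # each robot picks its own next waypoint
```

Here the exploration weight shrinks as the mission clock advances, and candidate waypoints near peers' planned samples are multiplicatively discounted, which is the intuition behind local penalization in batch BO.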
Swarm robotic search is concerned with searching for targets in unknown environments (e.g., for search and rescue or hazard localization) using a large number of collaborating simple mobile robots. In such applications, decentralized swarm systems are touted for their task/coverage scalability, time efficiency, and fault tolerance. To guide the behavior of such swarm systems, two broad classes of approaches are available, namely nature-inspired swarm heuristics and multi-robot search methods. However, simultaneously offering computationally efficient scalability and fundamental insights into the exhibited behavior (instead of a black-box behavior model) remains challenging under either class of approaches. In this paper, we develop an important extension of the batch Bayesian search method for application to embodied swarm systems searching in a physical 2D space. Key contributions lie in: 1) designing an acquisition function that not only balances exploration and exploitation across the swarm, but also allows modeling knowledge extraction over trajectories; and 2) developing its distributed implementation to allow asynchronous task inference and path planning by the swarm robots. The resulting collective informative path planning approach is tested on target search case studies of varying complexity, where the target produces a spatially varying (measurable) signal. Significantly superior performance, in terms of mission completion efficiency, is observed compared to exhaustive search and random walk baselines, along with favorable performance scalability with increasing swarm size.
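A minimal sketch of the asynchronous, per-robot decision loop that such a distributed implementation implies is given below, assuming each robot maintains its own Gaussian process belief over the signal field and shares observations opportunistically. `GaussianProcessRegressor` is scikit-learn's; the class structure and the candidate-grid planning step are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

class SwarmRobot:
    """Hypothetical per-robot loop: each robot infers and plans asynchronously."""

    def __init__(self, arena_size=10.0, n_candidates=200):
        self.gp = GaussianProcessRegressor()      # local belief over signal field
        self.X, self.y = [], []                   # observations gathered so far
        rng = np.random.default_rng(0)
        self.candidates = rng.uniform(0, arena_size, size=(n_candidates, 2))

    def step(self, inbox, position, signal):
        # Fold in own measurement plus any peer observations that have arrived;
        # the robot never blocks waiting for the rest of the swarm.
        self.X.append(position); self.y.append(signal)
        for x_peer, y_peer in inbox:
            self.X.append(x_peer); self.y.append(y_peer)
        self.gp.fit(np.array(self.X), np.array(self.y))
        mu, sigma = self.gp.predict(self.candidates, return_std=True)
        waypoint = self.candidates[np.argmax(mu + sigma)]   # next informative goal
        return waypoint, (position, signal)       # goal + payload to broadcast
```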
Multiple robotic systems working together can provide important solutions to real-world applications (e.g., disaster response), among which task allocation problems feature prominently. Very few existing decentralized multi-robot task allocation (MRTA) methods simultaneously offer the following capabilities: handling task deadlines, respecting robot range and task-completion capacity limits, and supporting asynchronous decision-making over dynamic task spaces. To provide these capabilities, this paper presents a computationally efficient algorithm built on a novel construction and matching of bipartite graphs. Its performance is tested on a multi-UAV flood response application.
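Although the paper's graph construction is its own, the matching step can be illustrated generically. The sketch below, a minimal assumption-laden illustration rather than the paper's algorithm, builds a robot-task cost matrix in which deadline- or range-infeasible edges are pruned with a large cost, then solves the assignment with SciPy's Hungarian-algorithm routine `linear_sum_assignment`.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_robots_to_tasks(dist, speed, deadlines, max_range):
    """Generic robot-task bipartite matching (illustrative, not the paper's method).

    dist      : (n_robots, n_tasks) robot-to-task distances
    speed     : robot speeds, shape (n_robots,)
    deadlines : task deadlines, shape (n_tasks,)
    max_range : per-robot remaining range, shape (n_robots,)
    """
    eta = dist / speed[:, None]                      # estimated arrival times
    feasible = (eta <= deadlines) & (dist <= max_range[:, None])
    cost = np.where(feasible, eta, 1e9)              # prune infeasible edges
    rows, cols = linear_sum_assignment(cost)         # Hungarian algorithm
    return [(r, c) for r, c in zip(rows, cols) if feasible[r, c]]
```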
One of the crucial problems in swarm robotic operations is searching for and neutralizing heterogeneous targets in an unknown and uncertain environment, without any communication within the swarm. Here, some targets can be neutralized by a single robot, while others need multiple robots acting in a particular sequence. The complexity of the problem arises from scalability and information uncertainty, which restrict each robot's awareness of the swarm and the target distribution. In this paper, this problem is addressed by proposing a novel Context-Aware Deep Q-Network (CA-DQN) framework to obtain communication-free cooperation between the robots in the swarm. Each robot maintains an adaptive grid representation of its vicinity with the context information embedded into it, keeping the swarm intact while searching for and neutralizing the targets. The problem formulation uses a reinforcement learning framework in which two Deep Q-Networks (DQNs) handle conflict and conflict-free scenarios separately. A self-play-based approach is used to determine the optimal policy for the DQNs. Monte Carlo simulations and comparison studies with a state-of-the-art coalition formation algorithm are performed to verify the performance of CA-DQN under varying environmental parameters. The results show that the approach is invariant to the number of detected targets and the number of robots in the swarm. The paper also presents a real-time implementation of CA-DQN for different scenarios using ground robots in a laboratory environment, demonstrating that CA-DQN runs on low-power computing devices.
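The two-network routing idea can be sketched compactly. The PyTorch snippet below is a hypothetical illustration, with invented layer sizes, grid resolution, and action count, of how a robot might flatten its context grid and route it through either the conflict or the conflict-free Q-network; it is not the paper's architecture.

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small Q-network over a flattened context grid (illustrative sizes)."""
    def __init__(self, grid_cells=121, n_actions=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(grid_cells, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )
    def forward(self, x):
        return self.net(x)

q_conflict, q_free = QNet(), QNet()   # two DQNs, trained separately via self-play

def select_action(context_grid, conflict_detected, epsilon=0.05):
    """Route the local context grid through the matching DQN (epsilon-greedy)."""
    if torch.rand(()) < epsilon:
        return torch.randint(0, 5, ()).item()        # explore
    net = q_conflict if conflict_detected else q_free
    with torch.no_grad():
        return net(context_grid.flatten()).argmax().item()
```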
Target search with unmanned aerial vehicles (UAVs) is a problem relevant to many scenarios, e.g., search and rescue (SaR). However, a key challenge is planning paths for maximal search efficiency given flight-time constraints. To address this, we propose the Obstacle-aware Adaptive Informative Path Planning (OA-IPP) algorithm for target search in cluttered environments using UAVs. Our approach leverages a layered planning strategy using a Gaussian Process (GP)-based model of target occupancy to generate informative paths in continuous 3D space. Within this framework, we introduce an adaptive replanning scheme that allows us to trade off information gain, field coverage, sensor performance, and collision avoidance for efficient target detection. Extensive simulations show that our OA-IPP method performs better than state-of-the-art planners, and we demonstrate its application in a realistic urban SaR scenario.
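The essence of scoring candidate paths against a GP occupancy model can be conveyed with a short sketch. The snippet below is a generic informative-path-planning illustration, not the OA-IPP implementation: it assumes the utility of a path is its predicted posterior-variance reduction per unit flight time, with paths that exceed the remaining budget rejected outright.

```python
import numpy as np

def path_utility(gp, path, budget_left, speed=1.0):
    """Illustrative path score: GP posterior variance collapsed per unit flight time.

    gp          : fitted model exposing predict(X, return_std=True)
                  (e.g., sklearn's GaussianProcessRegressor)
    path        : candidate waypoints along the path, shape (m, 3)
    budget_left : remaining flight time
    """
    length = np.sum(np.linalg.norm(np.diff(path, axis=0), axis=1))
    if length / speed > budget_left:
        return -np.inf                        # path violates flight-time budget
    _, sigma = gp.predict(path, return_std=True)
    return np.sum(sigma**2) / max(length, 1e-6)   # info gain per unit travel cost

def best_path(gp, candidate_paths, budget_left):
    scores = [path_utility(gp, p, budget_left) for p in candidate_paths]
    return candidate_paths[int(np.argmax(scores))]
```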
We present HiDe, a novel hierarchical reinforcement learning architecture that successfully solves long-horizon control tasks and generalizes to unseen test scenarios. Functional decomposition between planning and low-level control is achieved by explicitly separating the state-action spaces across the hierarchy, which allows the integration of task-relevant knowledge per layer. We propose an RL-based planner to efficiently leverage the information in the planning layer of the hierarchy, while the control layer learns a goal-conditioned control policy. The hierarchy is trained jointly but allows for the composition of different policies, such as transferring layers across multiple agents. We experimentally show that our method generalizes across unseen test environments and can scale to task horizons more than 3x longer than those handled by both learning-based and non-learning-based approaches. We evaluate on complex continuous control tasks with sparse rewards, including navigation and robot manipulation.
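The state-action separation can be sketched in a few lines. The snippet below is an illustrative skeleton, not the HiDe implementation: the planner is assumed to see only map-level observations and to emit a subgoal every k steps, while the controller sees only proprioceptive state plus that subgoal; the policy objects, observation split, and replanning interval are all assumptions.

```python
import numpy as np

class HierarchicalAgent:
    """Illustrative two-layer hierarchy with separated state-action spaces:
    a planner over the map emits subgoals at a coarse timescale, and a
    goal-conditioned controller acts on proprioceptive state only."""

    def __init__(self, planner_policy, control_policy, k=10):
        self.planner, self.controller, self.k = planner_policy, control_policy, k
        self.t, self.subgoal = 0, None

    def act(self, map_obs, proprio_obs):
        if self.t % self.k == 0:                   # replan at a coarser timescale
            self.subgoal = self.planner(map_obs)   # planner state space: map only
        self.t += 1
        # Controller state space: joint states + subgoal, with no map access,
        # so either layer can be swapped or transferred across agents.
        return self.controller(np.concatenate([proprio_obs, self.subgoal]))
```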