Graph Neural Networks for Decentralized Multi-Robot Submodular Action Selection


Published by: Lifeng Zhou

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Lifeng Zhou, Vishnu D. Sharma, Qingbiao Li

Robotics, Artificial Intelligence, Machine Learning


In this paper, we develop a learning-based approach for decentralized submodular maximization. We focus on applications where robots must jointly select actions, e.g., motion primitives, to maximize team submodular objectives using local communications only. Such applications are essential for large-scale multi-robot coordination, such as multi-robot motion planning for area coverage, environment exploration, and target tracking. However, current decentralized submodular maximization algorithms either require assumptions on the inter-robot communication or sacrifice some suboptimality guarantees. In this work, we propose a general-purpose learning architecture for submodular maximization at scale with decentralized communications. In particular, our learning architecture leverages a graph neural network (GNN) to capture local interactions among the robots and learns decentralized decision-making for them. We train the learning model by imitating an expert solution and deploy the resulting model for decentralized action selection involving local observations and communications only. We demonstrate the performance of our GNN-based learning approach in a scenario of active target coverage with large networks of robots. The simulation results show that our approach nearly matches the coverage performance of the expert algorithm, yet runs several orders of magnitude faster with more than 30 robots. The results also demonstrate our approach's generalization capability in previously unseen scenarios, e.g., larger environments and larger networks of robots.
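To make the notion of submodular action selection concrete, the following is a minimal sketch of a sequential greedy selector for a coverage objective. It is an illustration only: the abstract does not say which expert algorithm is imitated, and the function name `sequential_greedy`, the robot and action labels, and the target IDs are all hypothetical. Each robot, in a fixed order, picks the action that covers the most targets not yet covered by earlier robots. Because the marginal gain of an action can only shrink as more targets are covered, the coverage objective is submodular.

```python
from typing import Dict, Set, Tuple

# Hypothetical example data: for each robot, each candidate action
# (e.g., a motion primitive) maps to the set of target IDs it would cover.
EXAMPLE_ACTIONS: Dict[str, Dict[str, Set[int]]] = {
    "r1": {"left": {1, 2}, "right": {3, 4, 5}},
    "r2": {"left": {3, 4}, "right": {5, 6}},
    "r3": {"left": {1}, "right": {6, 7}},
}


def sequential_greedy(
    actions: Dict[str, Dict[str, Set[int]]],
) -> Tuple[Dict[str, str], int]:
    """Pick one action per robot to greedily maximize the number of covered targets.

    Robots are processed in sorted order; ties between actions are broken
    by sorted action name so the result is deterministic.
    """
    covered: Set[int] = set()
    choice: Dict[str, str] = {}
    for robot in sorted(actions):
        candidates = actions[robot]
        # Marginal gain = number of newly covered targets given earlier choices.
        best = max(sorted(candidates), key=lambda a: len(candidates[a] - covered))
        choice[robot] = best
        covered |= candidates[best]
    return choice, len(covered)
```

Calling `sequential_greedy(EXAMPLE_ACTIONS)` returns the per-robot choices together with the total number of covered targets. A centralized routine like this needs every earlier robot's choice, which is the kind of communication requirement a decentralized learned policy aims to avoid.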


Jun Liu, Lifeng Zhou, Pratap Tokekar, 2021

In this letter, we consider a distributed submodular maximization problem for multi-robot systems when attacked by adversaries. One of the major challenges for multi-robot systems is to increase resilience against failures or attacks. This is particularly important for distributed systems under attack, as there is no central point of command that can detect, mitigate, and recover from attacks. Instead, a distributed multi-robot system must coordinate effectively to overcome adversarial attacks. In this work, our distributed submodular action selection problem models a broad set of scenarios where each robot in a multi-robot system has multiple action selections that may fulfill a global objective, such as exploration or target tracking. To increase resilience in this context, we propose a fully distributed algorithm to guide each robot's action selection when the system is attacked. The proposed algorithm guarantees performance in a worst-case scenario where up to a portion of the robots malfunction due to attacks. Importantly, the proposed algorithm is also consistent, as it is shown to converge to the same solution as a centralized method. Finally, a distributed resilient multi-robot exploration problem is presented to confirm the performance of the proposed algorithm.

Robotics

Decentralized Structural-RNN for Robot Crowd Navigation with Deep Reinforcement Learning

Shuijing Liu, Peixin Chang, Weihang Liang, 2020

Safe and efficient navigation through human crowds is an essential capability for mobile robots. Previous work on robot crowd navigation assumes that the dynamics of all agents are known and well-defined. In addition, the performance of previous methods deteriorates in partially observable environments and environments with dense crowds. To tackle these problems, we propose the decentralized structural-Recurrent Neural Network (DS-RNN), a novel network that reasons about spatial and temporal relationships for robot decision making in crowd navigation. We train our network with model-free deep reinforcement learning without any expert supervision. We demonstrate that our model outperforms previous methods in challenging crowd navigation scenarios. We successfully transfer the policy learned in the simulator to a real-world TurtleBot 2i.

Robotics, Artificial Intelligence, Machine Learning

Communication-Aware Multi-robot Coordination with Submodular Maximization

Guangyao Shi, Ishat E Rabban, Lifeng Zhou, 2020

Submodular maximization has been widely used in many multi-robot task planning problems, including information gathering, exploration, and target tracking. However, the interplay between submodular maximization and communication is rarely explored in the multi-robot setting. In many cases, maximizing the submodular objective may drive the robots in a way that disconnects the communication network. Driven by such observations, in this paper, we consider the problem of maximizing a submodular function with connectivity constraints. Specifically, we propose a problem called Communication-aware Submodular Maximization (CSM), in which communication maintenance and submodular maximization are jointly considered in the decision-making process. We propose a heuristic algorithm that consists of two stages, i.e., topology generation and deviation minimization. We validate the formulation and algorithm through numerical simulation. We find that our algorithm on average suffers only a slight performance decrease compared to the pure greedy strategy.
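The connectivity constraint mentioned above can be illustrated with a small check on a candidate set of robot positions. This is a sketch under stated assumptions, not the CSM algorithm: the abstract does not specify the communication model, so a simple disk model is assumed here (two robots can communicate if they are within `comm_range` of each other), and the function name `is_connected` is hypothetical. The check runs a breadth-first search over the induced communication graph.

```python
from collections import deque
from math import dist
from typing import Sequence, Tuple


def is_connected(positions: Sequence[Tuple[float, float]], comm_range: float) -> bool:
    """Return True if the disk-model communication graph over positions is connected.

    Assumption: robots i and j share a link when their Euclidean distance
    is at most comm_range. An empty team is treated as trivially connected.
    """
    n = len(positions)
    if n == 0:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(n):
            if j not in seen and dist(positions[i], positions[j]) <= comm_range:
                seen.add(j)
                queue.append(j)
    # Connected exactly when the search from robot 0 reached every robot.
    return len(seen) == n
```

A planner could use such a check to reject joint actions whose resulting positions would split the team, for example three robots on a line at x = 0, 1, 3 with `comm_range = 1.0`.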

Robotics

Composable Action-Conditioned Predictors: Flexible Off-Policy Learning for Robot Navigation

Gregory Kahn, Adam Villaflor, Pieter Abbeel, 2018

A general-purpose intelligent robot must be able to learn autonomously and accomplish multiple tasks in order to be deployed in the real world. However, standard reinforcement learning approaches learn separate task-specific policies and assume the reward function for each task is known a priori. We propose a framework that learns event cues from off-policy data and can flexibly combine these event cues at test time to accomplish different tasks. These event cue labels are not assumed to be known a priori, but are instead labeled using learned models, such as computer vision detectors, and then "backed up" in time using an action-conditioned predictive model. We show that a simulated robotic car and a real-world RC car can gather data and train fully autonomously without any human-provided labels beyond those needed to train the detectors, and then at test time accomplish a variety of different tasks. Videos of the experiments and code can be found at https://github.com/gkahn13/CAPs

Robotics, Artificial Intelligence, Machine Learning

Neural fidelity warping for efficient robot morphology design

Sha Hu, Zeshi Yang, Greg Mori, 2020

We consider the problem of optimizing a robot morphology to achieve the best performance for a target task under computational resource limitations. The evaluation process for each morphological design involves learning a controller for the design, which can consume substantial time and computational resources. To address the challenge of expensive robot morphology evaluation, we present a continuous multi-fidelity Bayesian Optimization framework that efficiently utilizes computational resources via low-fidelity evaluations. We identify the problem of non-stationarity over fidelity space. Our proposed fidelity warping mechanism can learn representations of learning epochs and tasks to model non-stationary covariances between continuous fidelity evaluations, which prove challenging for off-the-shelf stationary kernels. Various experiments demonstrate that our method can utilize the low-fidelity evaluations to efficiently search for the optimal robot morphology, outperforming state-of-the-art methods.

Robotics, Artificial Intelligence, Machine Learning