Multi-Agent Path Planning based on MPC and DDPG

Published by Junxiao Xue

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Junxiao Xue, Xiangyan Kong, Bowei Dong

Artificial Intelligence, Robotics

The problem of mixed static and dynamic obstacle avoidance is essential for path planning in highly dynamic environments. However, the paths formed by grid edges can be longer than the true shortest paths in the terrain, since their headings are artificially constrained, and existing methods can hardly deal with dynamic obstacles. To address this problem, we propose a new algorithm combining Model Predictive Control (MPC) with Deep Deterministic Policy Gradient (DDPG). First, we apply the MPC algorithm to predict the trajectories of dynamic obstacles. Second, DDPG with a continuous action space is designed to give robots learning and autonomous decision-making capability. Finally, we introduce the idea of the Artificial Potential Field to set the reward function, which improves convergence speed and accuracy. We employ Unity 3D to perform simulation experiments in highly uncertain environments such as aircraft carrier decks and squares. The results show that our method improves accuracy by 7%-30% compared with the other methods and, compared with DQN (Deep Q Network), reduces the path length by 100 units and the turning angle by 400-450 degrees.
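
The abstract does not give the reward formula, so the following is only a minimal sketch of how an Artificial Potential Field could shape a reward, assuming the classic quadratic attractive term and inverse-distance repulsive term. The function names, the gains `k_att` and `k_rep`, the `influence` radius, and the constant-velocity rollout (a simple stand-in for the MPC obstacle predictor) are all illustrative assumptions, not the authors' implementation.

```python
import math

def predict_positions(pos, vel, horizon, dt=1.0):
    """Constant-velocity rollout of one obstacle.

    Assumption: a stand-in for the MPC-based trajectory prediction
    mentioned in the abstract, not the authors' predictor.
    """
    return [(pos[0] + vel[0] * dt * k, pos[1] + vel[1] * dt * k)
            for k in range(1, horizon + 1)]

def apf_reward(pos, goal, obstacles, k_att=1.0, k_rep=1.0, influence=2.0):
    """Reward as the negative artificial potential at `pos` (higher is better)."""
    # Attractive term: grows with the squared distance to the goal.
    d_goal = math.dist(pos, goal)
    potential = 0.5 * k_att * d_goal ** 2
    # Repulsive term: active only within `influence` of an obstacle.
    for obs in obstacles:
        d = max(math.dist(pos, obs), 1e-6)  # avoid division by zero
        if d < influence:
            potential += 0.5 * k_rep * (1.0 / d - 1.0 / influence) ** 2
    return -potential

# Score the robot's position against where an obstacle is predicted to be.
future = predict_positions(pos=(4.0, 0.0), vel=(-1.0, 0.0), horizon=3)
reward = apf_reward(pos=(0.0, 0.0), goal=(3.0, 4.0), obstacles=future)
```

A dense reward of this kind gives the agent a gradient toward the goal at every step, which is one plausible reason such shaping would speed up convergence compared with a sparse goal-reached signal.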

328 - Kevin Osanlou, Christophe Guettier, Andrei Bursuc 2021

Learning-based methods are increasingly popular for search algorithms in single-criterion optimization problems. In contrast, for multiple-criteria optimization there are significantly fewer approaches despite the existence of numerous applications. Constrained path-planning for Autonomous Ground Vehicles (AGV) is one such application, where an AGV is typically deployed in disaster relief or search and rescue applications in off-road environments. The agent can be faced with the following dilemma: optimize a source-destination path according to a known criterion and an uncertain criterion under operational constraints. The known criterion is associated with the cost of the path, representing the distance. The uncertain criterion represents the feasibility of driving through the path without requiring human intervention. It depends on various external parameters such as the physics of the vehicle, the state of the explored terrains or weather conditions. In this work, we leverage knowledge acquired through offline simulations by training a neural network model to predict the uncertain criterion. We integrate this model inside a path-planner which can solve problems online. Finally, we conduct experiments on realistic AGV scenarios which illustrate that the proposed framework requires human intervention less frequently, at the cost of a limited increase in the path distance.
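
The trade-off described above can be illustrated with a small sketch. It assumes a Dijkstra-style search in which each edge cost is the distance plus a penalty scaled by the predicted infeasibility. The `feasibility` callable stands in for the trained neural network, and the weighted-sum combination and the `penalty` parameter are assumptions made for illustration; the abstract does not say how the planner combines the two criteria.

```python
import heapq

def plan(graph, start, goal, feasibility, penalty=10.0):
    """Cheapest path under distance + penalty * (1 - feasibility).

    graph[u] = {v: distance}. feasibility(u, v) returns a value in [0, 1]
    and is a hypothetical stand-in for a learned predictor.
    """
    frontier = [(0.0, start, [start])]
    settled = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, dist in graph.get(node, {}).items():
            if nxt not in settled:
                step = dist + penalty * (1.0 - feasibility(node, nxt))
                heapq.heappush(frontier, (cost + step, nxt, path + [nxt]))
    return None  # goal unreachable

# Toy map: the short route A-B-D crosses an edge predicted to be risky.
graph = {"A": {"B": 1.0, "C": 2.0}, "B": {"D": 1.0}, "C": {"D": 2.0}, "D": {}}
risky = {("A", "B"): 0.2}
feas = lambda u, v: risky.get((u, v), 1.0)

shortest = plan(graph, "A", "D", feas, penalty=0.0)   # ignores feasibility
cautious = plan(graph, "A", "D", feas, penalty=10.0)  # avoids the risky edge
```

With the penalty switched off the search returns the shorter route through B; with it on, the search accepts the longer route through C, mirroring the limited increase in path distance that the abstract reports.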

Artificial Intelligence, Machine Learning, Robotics

Scalable Anytime Planning for Multi-Agent MDPs

160 - Shushman Choudhury, Jayesh K. Gupta, Peter Morales 2021

We present a scalable tree search planning algorithm for large multi-agent sequential decision problems that require dynamic collaboration. Teams of agents need to coordinate decisions in many domains, but naive approaches fail due to the exponential growth of the joint action space with the number of agents. We circumvent this complexity through an anytime approach that allows us to trade computation for approximation quality and also dynamically coordinate actions. Our algorithm comprises three elements: online planning with Monte Carlo Tree Search (MCTS), factored representations of local agent interactions with coordination graphs, and the iterative Max-Plus method for joint action selection. We evaluate our approach on the benchmark SysAdmin domain with static coordination graphs and achieve comparable performance with much lower computation cost than our MCTS baselines. We also introduce a multi-drone delivery domain with dynamic, i.e., state-dependent coordination graphs, and demonstrate how our approach scales to large problems on this domain that are intractable for other MCTS methods. We provide an open-source implementation of our algorithm at https://github.com/JuliaPOMDP/FactoredValueMCTS.jl.
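
To make the joint action selection step concrete, here is a minimal sketch of Max-Plus message passing on a coordination graph with pairwise payoffs only, checked against brute-force enumeration of the joint action space. It is not the FactoredValueMCTS.jl implementation: the function names, the payoff layout, and the fixed iteration count are assumptions, and the MCTS part is omitted entirely. Max-Plus is exact on tree-structured graphs such as the chain used here; on graphs with cycles it is only approximate and usually needs message normalization, which this sketch leaves out.

```python
import itertools

def max_plus(n_agents, n_actions, edges, payoff, iters=10):
    """Pick one action per agent; payoff[(i, j)][a_i][a_j] is the edge payoff."""
    nbrs = {i: [] for i in range(n_agents)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def local(i, j, ai, aj):
        # Look up the edge payoff regardless of edge orientation.
        return payoff[(i, j)][ai][aj] if (i, j) in payoff else payoff[(j, i)][aj][ai]

    # msg[(i, j)][a_j]: message from agent i to its neighbor j.
    msg = {(i, j): [0.0] * n_actions for i in nbrs for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in msg:
            new[(i, j)] = [
                max(local(i, j, ai, aj)
                    + sum(msg[(k, i)][ai] for k in nbrs[i] if k != j)
                    for ai in range(n_actions))
                for aj in range(n_actions)
            ]
        msg = new
    # Each agent maximizes the sum of its incoming messages on its own.
    return [max(range(n_actions),
                key=lambda a, i=i: sum(msg[(k, i)][a] for k in nbrs[i]))
            for i in range(n_agents)]

def brute_force(n_agents, n_actions, edges, payoff):
    """Enumerate all n_actions ** n_agents joint actions (exponential)."""
    best = max(itertools.product(range(n_actions), repeat=n_agents),
               key=lambda joint: sum(payoff[(i, j)][joint[i]][joint[j]]
                                     for i, j in edges))
    return list(best)

# Chain 0 - 1 - 2 with two actions per agent.
edges = [(0, 1), (1, 2)]
payoff = {(0, 1): [[1, 0], [0, 3]], (1, 2): [[2, 0], [0, 1]]}
joint = max_plus(3, 2, edges, payoff)
```

Each message update only maximizes over one agent's actions, so the work per iteration grows with the number of edges rather than with the size of the joint action space that `brute_force` enumerates.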

Artificial Intelligence, Multi-Agent Systems

Multi-Objective Multi-Agent Planning for Jointly Discovering and Tracking Mobile Object

88 - Hoa Van Nguyen, Hamid Rezatofighi, Ba-Ngu Vo 2019

We consider the challenging problem of online planning for a team of agents to autonomously search and track a time-varying number of mobile objects under the practical constraint of detection-range-limited onboard sensors. A standard POMDP with a value function that either encourages discovery or accurate tracking of mobile objects is inadequate to simultaneously meet the conflicting goals of searching for undiscovered mobile objects whilst keeping track of discovered objects. The planning problem is further complicated by misdetections or false detections of objects caused by range-limited sensors and noise inherent to sensor measurements. We formulate a novel multi-objective POMDP based on information-theoretic criteria, and an online multi-object tracking filter for the problem. Since controlling multiple agents is a well-known combinatorial optimization problem, assigning control actions to agents necessitates a greedy algorithm. We prove that our proposed multi-objective value function is a monotone submodular set function; consequently, the greedy algorithm can achieve a (1-1/e) approximation for maximizing the submodular multi-objective function.
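
The greedy argument can be illustrated on a textbook monotone submodular function. The sketch below uses set coverage under a cardinality limit `k`, for which greedy selection is known to reach at least a (1-1/e) fraction of the optimum. The coverage objective, the function name, and the toy data are assumptions chosen for illustration; the paper's value function is information-theoretic and is not reproduced here.

```python
def greedy_max_coverage(candidates, k):
    """Pick up to k named sets, each time taking the largest marginal gain.

    candidates: dict mapping a name to the set of items it covers.
    Coverage is monotone and submodular, so the greedy choice is
    within a (1 - 1/e) factor of the best k-subset.
    """
    chosen, covered = [], set()
    for _ in range(k):
        best = max((name for name in candidates if name not in chosen),
                   key=lambda name: len(candidates[name] - covered),
                   default=None)
        if best is None or not candidates[best] - covered:
            break  # nothing left that adds coverage
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered

# Toy example: each "action" observes a set of cells.
actions = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}
picked, seen = greedy_max_coverage(actions, k=2)
```

Here the first pick is `c` (four new cells) and the second is `a` (three new cells), because `b` would add only one cell once `c` is chosen. That diminishing marginal gain is the submodularity property the guarantee relies on.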

Multi-Agent Systems, Robotics, Systems and Control

Specification mining and automated task planning for autonomous robots based on a graph-based spatial temporal logic

82 - Zhiyu Liu, Meng Jiang, Hai Lin 2020

We aim to enable an autonomous robot to learn new skills from demo videos and use these newly learned skills to accomplish non-trivial high-level tasks. Developing such an autonomous robot involves knowledge representation, specification mining, and automated task planning. For knowledge representation, we use a graph-based spatial temporal logic (GSTL) to capture spatial and temporal information of related skills demonstrated by demo videos. We design a specification mining algorithm to generate a set of parametric GSTL formulas from demo videos by inductively constructing spatial terms and temporal formulas. The resulting parametric GSTL formulas from specification mining serve as a domain theory, which is used in automated task planning for autonomous robots. We propose an automatic task planning approach based on GSTL where a proposer is used to generate ordered actions, and a verifier is used to generate executable task plans. A table setting example is used throughout the paper to illustrate the main ideas.

Artificial Intelligence, Robotics, Systems and Control

Multiobjective Coverage Path Planning: Enabling Automated Inspection of Complex, Real-World Structures

62 - Kai Olav Ellefsen, Herman A. Lepikson, Jan C. Albiez 2019

An important open problem in robotic planning is the autonomous generation of 3D inspection paths -- that is, planning the best path to move a robot along in order to inspect a target structure. We recently suggested a new method for planning paths allowing the inspection of complex 3D structures, given a triangular mesh model of the structure. The method differs from previous approaches in its emphasis on generating and also considering plans that result in imperfect coverage of the inspection target. In many practical tasks, one would accept imperfections in coverage if this results in a substantially more energy-efficient inspection path. The key idea is to use a multiobjective evolutionary algorithm to optimize the energy usage and coverage of inspection plans simultaneously; the result is a set of plans exploring the different ways to balance the two objectives. We here test our method on a set of inspection targets with large variation in size and complexity, and compare its performance with two state-of-the-art methods for complete coverage path planning. The results strengthen our confidence in the ability of our method to generate good inspection plans for different types of targets. The method's advantage is most clearly seen for real-world inspection targets, since traditional complete coverage methods have no good way of generating plans for structures with hidden parts. Multiobjective evolution, by optimizing energy usage and coverage together, ensures a good balance between the two, both when 100% coverage is feasible and when large parts of the object are hidden.
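
The "set of plans" that balances the two objectives is a Pareto front. As a minimal sketch, assuming each plan is summarized by an `(energy, coverage)` pair with energy to be minimized and coverage to be maximized, the non-dominated plans can be filtered as follows. The function name and the numbers are illustrative assumptions; the evolutionary search that produces the candidate plans is not shown.

```python
def pareto_front(plans):
    """Keep the plans that no other plan dominates.

    plans: list of (energy, coverage) pairs; lower energy and higher
    coverage are better.
    """
    def dominates(p, q):
        # p is at least as good as q on both objectives and differs from q.
        return p[0] <= q[0] and p[1] >= q[1] and p != q

    return [p for p in plans if not any(dominates(q, p) for q in plans)]

# Hypothetical candidate plans as (energy, coverage fraction).
candidates = [(10, 0.9), (5, 0.7), (12, 0.9), (6, 0.6), (20, 1.0)]
front = pareto_front(candidates)
```

In this toy data `(12, 0.9)` is dropped because `(10, 0.9)` reaches the same coverage for less energy, and `(6, 0.6)` is dropped because `(5, 0.7)` beats it on both objectives. The three remaining plans are the distinct trade-offs a user would choose between.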

Artificial Intelligence, Robotics