No Arabic abstract
Everyday tasks are characterized by their varieties and variations, and frequently are not clearly specified to service agents. This paper presents a comprehensive approach to enable a service agent to deal with everyday tasks in open, uncontrolled environments. We introduce a generic structure for representing tasks, and another structure for representing situations. Based on the two newly introduced structures, we present a methodology of situation handling that avoids hard-coding domain rules while improving the scalability of real-world task planning systems.
We present a scalable tree search planning algorithm for large multi-agent sequential decision problems that require dynamic collaboration. Teams of agents need to coordinate decisions in many domains, but naive approaches fail due to the exponential growth of the joint action space with the number of agents. We circumvent this complexity through an anytime approach that allows us to trade computation for approximation quality and also dynamically coordinate actions. Our algorithm comprises three elements: online planning with Monte Carlo Tree Search (MCTS), factored representations of local agent interactions with coordination graphs, and the iterative Max-Plus method for joint action selection. We evaluate our approach on the benchmark SysAdmin domain with static coordination graphs and achieve comparable performance with much lower computation cost than our MCTS baselines. We also introduce a multi-drone delivery domain with dynamic, i.e., state-dependent coordination graphs, and demonstrate how our approach scales to large problems on this domain that are intractable for other MCTS methods. We provide an open-source implementation of our algorithm at https://github.com/JuliaPOMDP/FactoredValueMCTS.jl.
Task planning in a probabilistic belief state domains allows generating complex and robust execution policies in those domains affected by state uncertainty. The performance of a task planner relies on the belief state representation. However, current belief state representation becomes easily intractable as the number of variables and execution time grows. To address this problem, we developed a novel belief state representation based on cartesian product and union operations over belief substates. These two operations and single variable assignment nodes form And-Or directed acyclic graph of Belief State (AOBS). We show how to apply actions with probabilistic outcomes and measure the probability of conditions holding over belief state. We evaluated AOBS performance in simulated forward state space exploration. We compared the size of AOBS with the size of Binary Decision Diagrams (BDD) that were previously used to represent belief state. We show that AOBS representation is not only much more compact than a full belief state but it also scales better than BDD for most of the cases.
Planning is one of the main approaches used to improve agents working efficiency by making plans beforehand. However, during planning, agents face the risk of having their private information leaked. This paper proposes a novel strong privacy-preserving planning approach for logistic-like problems. This approach outperforms existing approaches by addressing two challenges: 1) simultaneously achieving strong privacy, completeness and efficiency, and 2) addressing communication constraints. These two challenges are prevalent in many real-world applications including logistics in military environments and packet routing in networks. To tackle these two challenges, our approach adopts the differential privacy technique, which can both guarantee strong privacy and control communication overhead. To the best of our knowledge, this paper is the first to apply differential privacy to the field of multi-agent planning as a means of preserving the privacy of agents for logistic-like problems. We theoretically prove the strong privacy and completeness of our approach and empirically demonstrate its efficiency. We also theoretically analyze the communication overhead of our approach and illustrate how differential privacy can be used to control it.
The problem of mixed static and dynamic obstacle avoidance is essential for path planning in highly dynamic environment. However, the paths formed by grid edges can be longer than the true shortest paths in the terrain since their headings are artificially constrained. Existing methods can hardly deal with dynamic obstacles. To address this problem, we propose a new algorithm combining Model Predictive Control (MPC) with Deep Deterministic Policy Gradient (DDPG). Firstly, we apply the MPC algorithm to predict the trajectory of dynamic obstacles. Secondly, the DDPG with continuous action space is designed to provide learning and autonomous decision-making capability for robots. Finally, we introduce the idea of the Artificial Potential Field to set the reward function to improve convergence speed and accuracy. We employ Unity 3D to perform simulation experiments in highly uncertain environment such as aircraft carrier decks and squares. The results show that our method has made great improvement on accuracy by 7%-30% compared with the other methods, and on the length of the path and turning angle by reducing 100 units and 400-450 degrees compared with DQN (Deep Q Network), respectively.
This paper presents a distributed, efficient, scalable and real-time motion planning algorithm for a large group of agents moving in 2 or 3-dimensional spaces. This algorithm enables autonomous agents to generate individual trajectories independently with only the relative position information of neighboring agents. Each agent applies a force-based control that contains two main terms: collision avoidance and navigational feedback. The first term keeps two agents separate with a certain distance, while the second term attracts each agent toward its goal location. Compared with existing collision-avoidance algorithms, the proposed force-based motion planning (FMP) algorithm is able to find collision-free motions with lower transition time, free from velocity state information of neighbouring agents. It leads to less computational overhead. The performance of proposed FMP is examined over several dense and complex 2D and 3D benchmark simulation scenarios, with results outperforming existing methods.