Target search and detection encompasses a variety of decision problems, such as coverage, surveillance, search, observation, and pursuit-evasion, among others. In this paper we develop a multi-agent deep reinforcement learning (MADRL) method to coordinate a group of aerial vehicles (drones) to locate a set of static targets in an unknown area. To that end, we have designed a realistic drone simulator that replicates the dynamics and perturbations of a real experiment, with a model built on statistical inferences drawn from experimental data. Our reinforcement learning method, trained in this simulator, finds near-optimal policies for the drones. In contrast to other state-of-the-art MADRL methods, our method is fully decentralized during both learning and execution, can handle high-dimensional and continuous observation spaces, and does not require tuning of additional hyperparameters.
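The abstract does not specify the learning algorithm, so the following minimal Python sketch only illustrates what "fully decentralized during both learning and execution" can look like: each drone keeps its own policy and updates it from its local observations and rewards alone, with no central critic or shared parameters. The dimensions, the REINFORCE-style update, and the local_step stub standing in for the drone simulator are assumptions for illustration, not the authors' method.

```python
# Hypothetical sketch: one independent learner per drone, trained only on
# local data (decentralized learning and execution).
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, N_DRONES = 16, 3, 4  # assumed sizes, not from the paper

class LocalPolicy(nn.Module):
    """Per-drone Gaussian policy over continuous control commands."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.Tanh(),
                                 nn.Linear(64, ACT_DIM))
        self.log_std = nn.Parameter(torch.zeros(ACT_DIM))

    def forward(self, obs):
        return torch.distributions.Normal(self.net(obs), self.log_std.exp())

def local_step(obs):
    """Stand-in for the drone simulator: next local observation and reward."""
    return torch.randn(OBS_DIM), torch.rand(1)

policies = [LocalPolicy() for _ in range(N_DRONES)]
optims = [torch.optim.Adam(p.parameters(), lr=3e-4) for p in policies]

for episode in range(10):
    for policy, opt in zip(policies, optims):
        obs = torch.randn(OBS_DIM)              # drone's own local observation
        log_probs, rewards = [], []
        for t in range(50):                     # rollout using local data only
            dist = policy(obs)
            action = dist.sample()
            log_probs.append(dist.log_prob(action).sum())
            obs, r = local_step(obs)
            rewards.append(r)
        # REINFORCE-style update from this drone's own return
        ret = torch.stack(rewards).sum()
        loss = -(torch.stack(log_probs).sum() * ret.detach())
        opt.zero_grad()
        loss.backward()
        opt.step()
```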
In reinforcement learning, agents learn by performing actions and observing their outcomes. Sometimes, it is desirable for a human operator to interrupt an agent in order to prevent dangerous situations from happening. Yet, as part of their l
In this paper, we study a joint detection, mapping and navigation problem for a single unmanned aerial vehicle (UAV) equipped with a low-complexity radar and flying in an unknown environment. The goal is to optimize its trajectory with the purpose of
In this paper, the circle formation control problem is addressed for a group of cooperative underactuated fish-like robots involving unknown nonlinear dynamics and disturbances. Based on reinforcement learning and cognitive consistency theory, we
Autonomous or teleoperated robots have been playing increasingly important roles in civil applications in recent years. Across the different civil domains where robots can support human operators, one of the areas where they can have more impact is i
Many real-world tasks involve multiple agents with partial observability and limited communication. Learning is challenging in these settings because of the agents' local viewpoints, which make the world appear non-stationary due to concurrently-exploring