ﻻ يوجد ملخص باللغة العربية
Real Time Strategy (RTS) games require macro strategies as well as micro strategies to obtain satisfactory performance since it has large state space, action space, and hidden information. This paper presents a novel hierarchical reinforcement learning model for mastering Multiplayer Online Battle Arena (MOBA) games, a sub-genre of RTS games. The novelty of this work are: (1) proposing a hierarchical framework, where agents execute macro strategies by imitation learning and carry out micromanipulations through reinforcement learning, (2) developing a simple self-learning method to get better sample efficiency for training, and (3) designing a dense reward function for multi-agent cooperation in the absence of game engine or Application Programming Interface (API). Finally, various experiments have been performed to validate the superior performance of the proposed method over other state-of-the-art reinforcement learning algorithms. Agent successfully learns to combat and defeat bronze-level built-in AI with 100% win rate, and experiments show that our method can create a competitive multi-agent for a kind of mobile MOBA game {it King of Glory} in 5v5 mode.
Object-centric representations have recently enabled significant progress in tackling relational reasoning tasks. By building a strong object-centric inductive bias into neural architectures, recent efforts have improved generalization and data effic
Reinforcement learning in multi-agent scenarios is important for real-world applications but presents challenges beyond those seen in single-agent settings. We present an actor-critic algorithm that trains decentralized policies in multi-agent settin
Social learning is a key component of human and animal intelligence. By taking cues from the behavior of experts in their environment, social learners can acquire sophisticated behavior and rapidly adapt to new circumstances. This paper investigates
This paper proposes a definition of system health in the context of multiple agents optimizing a joint reward function. We use this definition as a credit assignment term in a policy gradient algorithm to distinguish the contributions of individual a
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities; however, common patterns of behavior often emerge among these agents/entities. Our method aims to leverage these commonalit