ﻻ يوجد ملخص باللغة العربية
In many real-world problems, a team of agents need to collaborate to maximize the common reward. Although existing works formulate this problem into a centralized learning with decentralized execution framework, which avoids the non-stationary problem in training, their decentralized execution paradigm limits the agents capability to coordinate. Inspired by the concept of correlated equilibrium, we propose to introduce a coordination signal to address this limitation, and theoretically show that following mild conditions, decentralized agents with the coordination signal can coordinate their individual policies as manipulated by a centralized controller. The idea of introducing coordination signal is to encapsulate coordinated strategies into the signals, and use the signals to instruct the collaboration in decentralized execution. To encourage agents to learn to exploit the coordination signal, we propose Signal Instructed Coordination (SIC), a novel coordination module that can be integrated with most existing MARL frameworks. SIC casts a common signal sampled from a pre-defined distribution to all agents, and introduces an information-theoretic regularization to facilitate the consistency between the observed signal and agents policies. Our experiments show that SIC consistently improves performance over well-recognized MARL models in both matrix games and a predator-prey game with high-dimensional strategy space.
Unsupervised skill discovery drives intelligent agents to explore the unknown environment without task-specific reward signal, and the agents acquire various skills which may be useful when the agents adapt to new tasks. In this paper, we propose Mul
Many real-world scenarios involve teams of agents that have to coordinate their actions to reach a shared goal. We focus on the setting in which a team of agents faces an opponent in a zero-sum, imperfect-information game. Team members can coordinate
Cooperative multi-agent reinforcement learning often requires decentralised policies, which severely limit the agents ability to coordinate their behaviour. In this paper, we show that common knowledge between agents allows for complex decentralised
The development of intelligent traffic light control systems is essential for smart transportation management. While some efforts have been made to optimize the use of individual traffic lights in an isolated way, related studies have largely ignored
We present a multi-agent learning algorithm, ALMA-Learning, for efficient and fair allocations in large-scale systems. We circumvent the traditional pitfalls of multi-agent learning (e.g., the moving target problem, the curse of dimensionality, or th