Highly dynamic mobile ad-hoc networks (MANETs) remain one of the most challenging environments in which to develop and deploy robust, efficient, and scalable routing protocols. In this paper, we present DeepCQ+ routing, which integrates emerging multi-agent deep reinforcement learning (MADRL) techniques into existing Q-learning-based routing protocols and their variants in a novel manner, and achieves consistently higher performance across a wide range of MANET configurations while training on only a limited range of network parameters and conditions. Quantitatively, DeepCQ+ shows consistently higher end-to-end throughput with lower overhead than its Q-learning-based counterparts, with an overall efficiency gain of 10-15%. Qualitatively, and more significantly, DeepCQ+ maintains remarkably similar performance gains in many scenarios that it was not trained for, in terms of network size, mobility conditions, and traffic dynamics. To the best of our knowledge, this is the first successful demonstration of MADRL for the MANET routing problem that achieves and maintains a high degree of scalability and robustness even in environments outside the trained range of scenarios. This implies that the hybrid design of DeepCQ+, which combines MADRL and Q-learning, significantly increases its practicality and explainability, because real-world MANET environments will likely vary beyond the trained range of scenarios.
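For readers unfamiliar with the Q-learning-based routing family that the abstract refers to, the following is a minimal sketch of classic tabular Q-routing, in which each node keeps an estimate of the delivery time to every destination via each neighbor. It is not the DeepCQ+ algorithm or its CQ+ baseline; the class name `QRouter`, the parameter names, and the default learning rate are assumptions made for illustration only.

```python
from collections import defaultdict


class QRouter:
    """Tabular Q-routing table for a single node (illustrative sketch only)."""

    def __init__(self, neighbors, alpha=0.5):
        self.neighbors = list(neighbors)
        self.alpha = alpha  # learning rate (assumed value, not from the paper)
        # q[dest][neighbor] = estimated delivery time to dest via that neighbor
        self.q = defaultdict(lambda: {n: 0.0 for n in self.neighbors})

    def best_estimate(self, dest):
        """Smallest estimated delivery time to dest over all neighbors."""
        return min(self.q[dest].values())

    def next_hop(self, dest):
        """Greedy choice: the neighbor with the lowest estimated delivery time."""
        row = self.q[dest]
        return min(row, key=row.get)

    def update(self, dest, neighbor, hop_delay, neighbor_estimate):
        """Move the estimate toward hop_delay plus the neighbor's own best estimate."""
        old = self.q[dest][neighbor]
        target = hop_delay + neighbor_estimate
        self.q[dest][neighbor] = old + self.alpha * (target - old)
        return self.q[dest][neighbor]
```

In this sketch, a node would call `update` each time a neighbor reports its own `best_estimate` for a destination, and `next_hop` when forwarding a packet. A MADRL-based design such as the one described above would presumably learn how to adjust or override decisions of this kind, but the abstract does not specify the details.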
It has long been recognized that multi-agent reinforcement learning (MARL) faces significant scalability issues because the sizes of the state and action spaces are exponentially large in the number of agents. In this paper, we identify a
This paper proposes a novel scalable reinforcement learning approach for simultaneous routing and spectrum access in wireless ad-hoc networks. In most previous works on reinforcement learning for network optimization, the network topology is assumed
We study reinforcement learning (RL) in a setting with a network of agents whose states and actions interact in a local manner where the objective is to find localized policies such that the (discounted) global reward is maximized. A fundamental chal
Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks). Our contribution, Melting Pot, is a MARL evaluation suite th
This paper develops an efficient multi-agent deep reinforcement learning algorithm for cooperative controls in power grids. Specifically, we consider the decentralized inverter-based secondary voltage control problem in distributed generators (DGs), w