Stochastic approximation, a data-driven approach for finding the fixed point of an unknown operator, provides a unified framework for treating many problems in stochastic optimization and reinforcement learning. Motivated by a growing interest in multi-agent and multi-task learning, we consider in this paper a decentralized variant of stochastic approximation: a network of agents, each with its own unknown operator and data observations, cooperatively finds the fixed point of the aggregate operator. Each agent runs a local stochastic approximation algorithm using noisy samples from its operator while averaging its iterates with those of its neighbors over a decentralized communication graph. Our main contribution is a finite-time analysis of this decentralized stochastic approximation algorithm, characterizing the impact of the underlying communication topology between agents. We model the data observed at each agent as sampled from a Markov process; this lack of independence makes the iterates biased and (potentially) unbounded. Under mild assumptions on the Markov processes, we show that the convergence rate of the proposed method is essentially the same as if the samples were independent, differing only by a log factor that accounts for the mixing time of the Markov process. We also present applications of the proposed method to a number of interesting learning problems in multi-agent systems, including a decentralized variant of Q-learning for solving multi-task reinforcement learning.
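To make the "local SA step plus neighbor averaging" update concrete, here is a minimal sketch under illustrative assumptions of our own, not taken from the paper: each agent i observes a hypothetical linear local operator F_i(x) = A_i x + b_i through temporally correlated AR(1) noise standing in for Markovian sampling, the communication graph is a ring with lazy Metropolis mixing weights W, and the step size decays with the iteration count. All names (A, b, W, step) are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim, n_steps = 5, 3, 2000

# Illustrative linear local operators F_i(x) = A_i x + b_i (assumption, not
# the paper's setting); the aggregate operator is their average.
A = [-np.eye(dim) + 0.1 * rng.standard_normal((dim, dim)) for _ in range(n_agents)]
b = [rng.standard_normal(dim) for _ in range(n_agents)]

# Doubly stochastic mixing matrix for a ring graph (lazy Metropolis weights).
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

x = np.zeros((n_agents, dim))                  # local iterates x_i
noise = rng.standard_normal((n_agents, dim))   # correlated noise state

for k in range(n_steps):
    step = 1.0 / (k + 10)                      # diminishing step size
    # AR(1) observation noise: samples are correlated across iterations,
    # mimicking (not reproducing) Markovian sampling.
    noise = 0.9 * noise + 0.1 * rng.standard_normal((n_agents, dim))
    # Each agent averages its neighbors' iterates, then takes a local
    # stochastic approximation step with its noisy operator sample.
    mixed = W @ x
    for i in range(n_agents):
        x[i] = mixed[i] + step * (A[i] @ x[i] + b[i] + noise[i])

# All agents should approach the fixed point of the aggregate operator,
# i.e. the solution of sum_i (A_i x + b_i) = 0.
x_star = np.linalg.solve(-sum(A), sum(b))
print("max distance to fixed point:", np.abs(x - x_star).max())
```

Because W is doubly stochastic, the averaging step preserves the network-wide mean of the iterates, so the consensus dynamics and the local SA steps jointly drive every agent toward the common fixed point; the correlated noise only illustrates the kind of non-i.i.d. sampling the paper's analysis accommodates.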