
Towards Resilience for Multi-Agent $QD$-Learning

Published by Yijing Xie
Publication date: 2021
Language: English





This paper considers the multi-agent reinforcement learning (MARL) problem for a networked (peer-to-peer) system in the presence of Byzantine agents. We build on an existing distributed $Q$-learning algorithm and allow certain agents in the network to behave in an arbitrary and adversarial manner (as captured by the Byzantine attack model). Under the proposed algorithm, if the network topology is $(2F+1)$-robust and at most $F$ Byzantine agents exist in the neighborhood of each regular agent, we establish the almost sure convergence of all regular agents' value functions to a neighborhood of the optimal value function of all regular agents. For each state, if the optimal $Q$-values of all regular agents corresponding to different actions are sufficiently separated, our approach allows each regular agent to learn the optimal policy for all regular agents.
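The abstract does not spell out how regular agents filter their neighbors' values. As a rough illustration only, the sketch below assumes a trimming rule of the kind commonly used in resilient consensus: for one scalar $Q$-value entry, an agent discards up to $F$ neighbor values above its own and up to $F$ below it before taking a consensus step. The function names, the step size `beta`, and the trimming rule itself are assumptions for this sketch, not the paper's stated algorithm.

```python
def trimmed_neighbor_values(own, neighbor_values, F):
    """Drop the F largest values above `own` and the F smallest below it.

    Hypothetical helper: if fewer than F values lie on one side of `own`,
    all values on that side are dropped.
    """
    higher = sorted(v for v in neighbor_values if v > own)
    lower = sorted(v for v in neighbor_values if v < own)
    equal = [v for v in neighbor_values if v == own]
    kept_higher = higher[:max(len(higher) - F, 0)]  # discard the F largest
    kept_lower = lower[min(F, len(lower)):]         # discard the F smallest
    return kept_lower + equal + kept_higher


def resilient_consensus_step(own, neighbor_values, F, beta):
    """One consensus step on a single Q-value entry using only kept values.

    `beta` is an assumed consensus step size; the local learning
    (innovation) term of a Q-learning update is omitted here.
    """
    kept = trimmed_neighbor_values(own, neighbor_values, F)
    return own - beta * sum(own - v for v in kept)


# Example: two extreme reports are discarded when F = 1.
kept = trimmed_neighbor_values(1.0, [0.9, 1.1, 100.0, -50.0], F=1)
updated = resilient_consensus_step(1.0, [0.9, 1.1, 100.0, -50.0], F=1, beta=0.1)
```

With `F = 1`, the reports `100.0` and `-50.0` are ignored, so a single Byzantine neighbor cannot pull the agent's value arbitrarily far in either direction.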




Read also

In this paper, we study the relationship between resilience and accuracy in the resilient distributed multi-dimensional consensus problem. We consider a network of agents, each of which has a state in $\mathbb{R}^d$. Some agents in the network are adversarial and can change their states arbitrarily. The normal (non-adversarial) agents interact locally and update their states to achieve consensus at some point in the convex hull $\mathcal{C}$ of their initial states. This objective is achievable if the number of adversaries in the neighborhood of normal agents is less than a specific value, which is a function of the local connectivity and the state dimension $d$. However, to be resilient against adversaries, especially in the case of large $d$, the required local connectivity is large. We discuss how resilience against adversarial agents can be improved if normal agents are allowed to converge in a bounded region $\mathcal{B} \supseteq \mathcal{C}$, which means that, in the worst case, normal agents converge at some point close to but not necessarily inside $\mathcal{C}$. The accuracy of resilient consensus can be measured by the Hausdorff distance between $\mathcal{B}$ and $\mathcal{C}$. As a result, resilience can be improved at the cost of accuracy. We propose a resilient bounded consensus algorithm that exploits the trade-off between resilience and accuracy by projecting $d$-dimensional states into lower dimensions and then solving instances of resilient consensus in lower dimensions. We analyze the algorithm, present various resilience and accuracy bounds, and also numerically evaluate our results.
In this paper, a distributed learning leader-follower consensus protocol based on Gaussian process regression is designed for a class of nonlinear multi-agent systems with unknown dynamics. We propose a distributed learning approach to predict the residual dynamics for each agent. The stability of the consensus protocol using the data-driven model of the dynamics is shown via Lyapunov analysis. By applying the proposed control law, the followers ultimately synchronize to the leader with guaranteed error bounds with high probability. The effectiveness and the applicability of the developed protocol are demonstrated by simulation examples.
This study considers the problem of periodic event-triggered (PET) cooperative output regulation for a class of linear multi-agent systems. The advantage of PET output regulation is that the data transmission and triggering condition need to be monitored only at discrete sampling instants. It is assumed that only a small number of agents have access to the system matrix and states of the leader. Meanwhile, the PET mechanism is considered not only in the communication between agents, but also in the sensor-to-controller and controller-to-actuator transmission channels of each agent. This problem setup brings some challenges to the controller design and stability analysis. Based on a novel PET distributed observer, a PET dynamic output feedback control method is developed for each follower. Compared with existing works, our method naturally excludes Zeno behavior, and the inter-event times become multiples of the sampling period. Furthermore, for every follower, the minimum inter-event time can be determined a priori and computed directly without knowledge of the leader information. An example is given to verify and illustrate the effectiveness of the new design scheme.
Yutao Tang, 2020
This paper investigates an optimal consensus problem for a group of uncertain linear multi-agent systems. All agents are allowed to possess parametric uncertainties that range over an arbitrarily large compact set. The goal is to collectively minimize a sum of local costs in a distributed fashion and finally achieve an output consensus on this optimal point using only output information of agents. By adding an optimal signal generator to generate the global optimal point, we convert this problem to several decentralized robust tracking problems. Output feedback integral control is constructively given to achieve an optimal consensus under a mild graph connectivity condition. The efficacy of this control is verified by a numerical example.
Consensusability is an important property for many multi-agent systems (MASs) as it implies the existence of networked controllers driving the states of MAS subsystems to the same value. Consensusability is of interest even when the MAS subsystems are physically coupled, which is the case for real-world systems such as power networks. In this paper, we study necessary and sufficient conditions for the consensusability of linear interconnected MASs. These conditions are given in terms of the parameters of the subsystem matrices, as well as the eigenvalues of the physical and communication graph Laplacians. Our results show that weak coupling between subsystems and fast information diffusion rates in the physical and communication graphs favor consensusability. Technical results are verified through computer simulations.