No Arabic abstract
A Multi-Agent System is a distributed system where the agents or nodes perform complex functions that cannot be written down in analytic form. Multi-Agent Systems are highly connected, and the information they contain is mostly stored in the connections. When agents update their state, they take into account the state of the other agents, and they have access to those states via the connections. There is also external, user-generated input into the Multi-Agent System. As so much information is stored in the connections, agents are often memory-less. This memory-less property, together with the randomness of the external input, has allowed us to model Multi-Agent Systems using Markov chains. In this paper, we look at Multi-Agent Systems that evolve, i.e. the number of agents varies according to the fitness of the individual agents. We extend our Markov chain model, and define stability. This is the start of a methodology to control Multi-Agent Systems. We then build upon this to construct an entropy-based definition for the degree of instability (entropy of the limit probabilities), which we used to perform a stability analysis. We then investigated the stability of evolving agent populations through simulation, and show that the results are consistent with the original definition of stability in non-evolving Multi-Agent Systems, proposed by Chli and De Wilde. This paper forms the theoretical basis for the construction of Digital Business Ecosystems, and applications have been reported elsewhere.
Humanity faces numerous problems of common-pool resource appropriation. This class of multi-agent social dilemma includes the problems of ensuring sustainable use of fresh water, common fisheries, grazing pastures, and irrigation systems. Abstract models of common-pool resource appropriation based on non-cooperative game theory predict that self-interested agents will generally fail to find socially positive equilibria---a phenomenon called the tragedy of the commons. However, in reality, human societies are sometimes able to discover and implement stable cooperative solutions. Decades of behavioral game theory research have sought to uncover aspects of human behavior that make this possible. Most of that work was based on laboratory experiments where participants only make a single choice: how much to appropriate. Recognizing the importance of spatial and temporal resource dynamics, a recent trend has been toward experiments in more complex real-time video game-like environments. However, standard methods of non-cooperative game theory can no longer be used to generate predictions for this case. Here we show that deep reinforcement learning can be used instead. To that end, we study the emergent behavior of groups of independently learning agents in a partially observed Markov game modeling common-pool resource appropriation. Our experiments highlight the importance of trial-and-error learning in common-pool resource appropriation and shed light on the relationship between exclusion, sustainability, and inequality.
We consider the problem of detecting norm violations in open multi-agent systems (MAS). We show how, using ideas from scrip systems, we can design mechanisms where the agents comprising the MAS are incentivised to monitor the actions of other agents for norm violations. The cost of providing the incentives is not borne by the MAS and does not come from fines charged for norm violations (fines may be impossible to levy in a system where agents are free to leave and rejoin again under a different identity). Instead, monitoring incentives come from (scrip) fees for accessing the services provided by the MAS. In some cases, perfect monitoring (and hence enforcement) can be achieved: no norms will be violated in equilibrium. In other cases, we show that, while it is impossible to achieve perfect enforcement, we can get arbitrarily close; we can make the probability of a norm violation in equilibrium arbitrarily small. We show using simulations that our theoretical results hold for multi-agent systems with as few as 1000 agents---the system rapidly converges to the steady-state distribution of scrip tokens necessary to ensure monitoring and then remains close to the steady state.
We present Distributed Simplex Architecture (DSA), a new runtime assurance technique that provides safety guarantees for multi-agent systems (MASs). DSA is inspired by the Simplex control architecture of Sha et al., but with some significant differences. The traditional Simplex approach is limited to single-agent systems or a MAS with a centralized control scheme. DSA addresses this limitation by extending the scope of Simplex to include MASs under distributed control. In DSA, each agent has a local instance of traditional Simplex such that the preservation of safety in the local instances implies safety for the entire MAS. We provide a proof of safety for DSA, and present experimental results for several case studies, including flocking with collision avoidance, safe navigation of ground rovers through way-points, and the safe operation of a microgrid.
In this work, we study emergent communication through the lens of cooperative multi-agent behavior in nature. Using insights from animal communication, we propose a spectrum from low-bandwidth (e.g. pheromone trails) to high-bandwidth (e.g. compositional language) communication that is based on the cognitive, perceptual, and behavioral capabilities of social agents. Through a series of experiments with pursuit-evasion games, we identify multi-agent reinforcement learning algorithms as a computational model for the low-bandwidth end of the communication spectrum.
Consensus strategies find a variety of applications in distributed coordination and decision making in multi-agent systems. In particular, average consensus plays a key role in a number of applications and is closely associated with two classes of digraphs, weight-balanced (for continuous-time systems) and bistochastic (for discrete-time systems). A weighted digraph is called balanced if, for each node, the sum of the weights of the edges outgoing from that node is equal to the sum of the weights of the edges incoming to that node. In addition, a weight-balanced digraph is bistochastic if all weights are nonnegative and, for each node, the sum of weights of edges incoming to that node and the sum of the weights of edges out-going from that node is unity; this implies that the corresponding weight matrix is column and row stochastic (i.e., doubly stochastic). We propose two distributed algorithms: one solves the weight-balance problem and the other solves the bistochastic matrix formation problem for a distributed system whose components (nodes) can exchange information via interconnection links (edges) that form an arbitrary, possibly directed, strongly connected communication topology (digraph). Both distributed algorithms achieve their goals asymptotically and operate iteratively by having each node adapt the (nonnegative) weights on its outgoing edges based on the weights of its incoming links (i.e., based on purely local information). We also provide examples to illustrate the operation, performance, and potential advantages of the proposed algorithms.