No Arabic abstract
Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals defined on much longer timescales, and in terms of relations that extend far beyond the body itself, ultimately involving coordination with other agents. Recent research in artificial intelligence has shown the promise of learning-based approaches to the respective problems of complex movement, longer-term planning and multi-agent coordination. However, there is limited research aimed at their integration. We study this problem by training teams of physically simulated humanoid avatars to play football in a realistic virtual environment. We develop a method that combines imitation learning, single- and multi-agent reinforcement learning and population-based training, and makes use of transferable representations of behaviour for decision making at different levels of abstraction. In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds. We investigate the emergence of behaviours at different levels of abstraction, as well as the representations that underlie these behaviours using several analysis techniques, including statistics from real-world sports analytics. Our work constitutes a complete demonstration of integrated decision-making at multiple scales in a physically embodied multi-agent setting. See project video at https://youtu.be/KHMwq9pv7mg.
We create an artificial system of agents (attention-based neural networks) which selectively exchange messages with each-other in order to study the emergence of memetic evolution and how memetic evolutionary pressures interact with genetic evolution of the network weights. We observe that the ability of agents to exert selection pressures on each-other is essential for memetic evolution to bootstrap itself into a state which has both high-fidelity replication of memes, as well as continuing production of new memes over time. However, in this system there is very little interaction between this memetic ecology and underlying tasks driving individual fitness - the emergent meme layer appears to be neither helpful nor harmful to agents ability to learn to solve tasks. Sourcecode for these experiments is available at https://github.com/GoodAI/memes
With the vast amount of data collected on football and the growth of computing abilities, many games involving decision choices can be optimized. The underlying rule is the maximization of an expected utility of outcomes and the law of large numbers. The data available allows us to compute with high accuracy the probabilities of outcomes of decisions and the well defined points system in the game allows us to have the necessary terminal utilities. With some well established theory we can then optimize choices at a single play level.
The human immune system has numerous properties that make it ripe for exploitation in the computational domain, such as robustness and fault tolerance, and many different algorithms, collectively termed Artificial Immune Systems (AIS), have been inspired by it. Two generations of AIS are currently in use, with the first generation relying on simplified immune models and the second generation utilising interdisciplinary collaboration to develop a deeper understanding of the immune system and hence produce more complex models. Both generations of algorithms have been successfully applied to a variety of problems, including anomaly detection, pattern recognition, optimisation and robotics. In this chapter an overview of AIS is presented, its evolution is discussed, and it is shown that the diversification of the field is linked to the diversity of the immune system itself, leading to a number of algorithms as opposed to one archetypal system. Two case studies are also presented to help provide insight into the mechanisms of AIS; these are the idiotypic network approach and the Dendritic Cell Algorithm.
A lot of efforts have been devoted to investigating how agents can learn effectively and achieve coordination in multiagent systems. However, it is still challenging in large-scale multiagent settings due to the complex dynamics between the environment and agents and the explosion of state-action space. In this paper, we design a novel Dynamic Multiagent Curriculum Learning (DyMA-CL) to solve large-scale problems by starting from learning on a multiagent scenario with a small size and progressively increasing the number of agents. We propose three transfer mechanisms across curricula to accelerate the learning process. Moreover, due to the fact that the state dimension varies across curricula,, and existing network structures cannot be applied in such a transfer setting since their network input sizes are fixed. Therefore, we design a novel network structure called Dynamic Agent-number Network (DyAN) to handle the dynamic size of the network input. Experimental results show that DyMA-CL using DyAN greatly improves the performance of large-scale multiagent learning compared with state-of-the-art deep reinforcement learning approaches. We also investigate the influence of three transfer mechanisms across curricula through extensive simulations.
The application of Network Science to social systems has introduced new methodologies to analyze classical problems such as the emergence of epidemics, the arousal of cooperation between individuals or the propagation of information along social networks. More recently, the organization of football teams and their performance have been unveiled using metrics coming from Network Science, where a team is considered as a complex network whose nodes (i.e., players) interact with the aim of overcoming the opponent network. Here, we combine the use of different network metrics to extract the particular signature of the F.C. Barcelona coached by Guardiola, which has been considered one of the best teams along football history. We have first compared the network organization of Guardiolas team with their opponents along one season of the Spanish national league, identifying those metrics with statistically significant differences and relating them with the Guardiolas game. Next, we have focused on the temporal nature of football passing networks and calculated the evolution of all network properties along a match, instead of considering their average. In this way, we are able to identify those network metrics that enhance the probability of scoring/receiving a goal, showing that not all teams behave in the same way and how the organization Guardiolas F.C. Barcelona is different from the rest, including its clustering coefficient, shortest-path length, largest eigenvalue of the adjacency matrix, algebraic connectivity and centrality distribution.