ﻻ يوجد ملخص باللغة العربية
We create an artificial system of agents (attention-based neural networks) which selectively exchange messages with each-other in order to study the emergence of memetic evolution and how memetic evolutionary pressures interact with genetic evolution of the network weights. We observe that the ability of agents to exert selection pressures on each-other is essential for memetic evolution to bootstrap itself into a state which has both high-fidelity replication of memes, as well as continuing production of new memes over time. However, in this system there is very little interaction between this memetic ecology and underlying tasks driving individual fitness - the emergent meme layer appears to be neither helpful nor harmful to agents ability to learn to solve tasks. Sourcecode for these experiments is available at https://github.com/GoodAI/memes
We draw upon a previously largely untapped literature on human collective intelligence as a source of inspiration for improving deep learning. Implicit in many algorithms that attempt to solve Deep Reinforcement Learning (DRL) tasks is the network of
In large-scale multi-agent systems, the large number of agents and complex game relationship cause great difficulty for policy learning. Therefore, simplifying the learning process is an important research issue. In many multi-agent systems, the inte
Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals de
The MAPF problem is the fundamental problem of planning paths for multiple agents, where the key constraint is that the agents will be able to follow these paths concurrently without colliding with each other. Applications of MAPF include automated w
We present a scalable tree search planning algorithm for large multi-agent sequential decision problems that require dynamic collaboration. Teams of agents need to coordinate decisions in many domains, but naive approaches fail due to the exponential