Solving a reinforcement learning problem typically requires correctly prespecifying the reward signal from which the algorithm learns. Here, we address reward signal design with an evolutionary approach that searches the space of all possible reward signals. We introduce a general framework for optimizing $N$ goals given $n$ reward signals. Through experiments in the game of Pong, we demonstrate that this approach allows agents to learn high-level goals (such as winning, losing and cooperating) from scratch, without prespecified reward signals. Some of the solutions found by the algorithm are surprising, in the sense that a person trying to hand-code a given behaviour through a specific reward signal would probably not have chosen them. Furthermore, the proposed approach appears to yield more stable training performance than typical score-based reward signals.
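The idea of evolving a reward signal toward a high-level goal can be illustrated with a minimal sketch. It is not the paper's implementation: the three-action toy environment, the event features (`hit_ball`, `score_point`, `concede_point`), the greedy `train_agent` stand-in for reinforcement learning, and all hyperparameters are assumptions made purely for illustration. The outer loop mutates weight vectors over $n = 3$ reward signals and ranks them by how well the resulting agent achieves a goal that the agent itself never observes.

```python
import random

# Hypothetical toy setup (not from the paper): each action yields n = 3
# event counts, in the order (hit_ball, score_point, concede_point).
ACTION_EVENTS = [
    (0.0, 0.0, 1.0),  # action 0: miss the ball and concede a point
    (1.0, 0.0, 0.0),  # action 1: return the ball, rally continues
    (1.0, 1.0, 0.0),  # action 2: return the ball and score a point
]


def train_agent(weights):
    """Stand-in for RL training: return the action with the highest reward.

    The reward of an action is the weighted sum of its event counts.
    """
    rewards = [sum(w * e for w, e in zip(weights, events))
               for events in ACTION_EVENTS]
    return max(range(len(rewards)), key=rewards.__getitem__)


def goal_win(action):
    """High-level goal used only as fitness: points scored minus conceded."""
    _, scored, conceded = ACTION_EVENTS[action]
    return scored - conceded


def goal_lose(action):
    """Opposite goal: the agent should concede rather than score."""
    return -goal_win(action)


def evolve_reward(goal, generations=30, pop_size=20, elite=5,
                  sigma=0.5, seed=0):
    """Evolve reward weights so that the trained agent maximizes `goal`."""
    rng = random.Random(seed)
    n = len(ACTION_EVENTS[0])
    population = [[rng.gauss(0, 1) for _ in range(n)]
                  for _ in range(pop_size)]

    def fitness(weights):
        return goal(train_agent(weights))

    for _ in range(generations):
        # Keep the best candidates unchanged, then refill the population
        # with Gaussian mutations of randomly chosen elite parents.
        parents = sorted(population, key=fitness, reverse=True)[:elite]
        children = [[g + rng.gauss(0, sigma) for g in rng.choice(parents)]
                    for _ in range(pop_size - elite)]
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    for name, goal in (("win", goal_win), ("lose", goal_lose)):
        weights = evolve_reward(goal)
        print(name, [round(w, 2) for w in weights], train_agent(weights))
```

Because fitness depends only on the behaviour the weights induce, many different weight vectors score equally well. That is one way an evolved reward signal can differ from what a person would hand-code for the same goal.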
Many real-world scenarios involve teams of agents that have to coordinate their actions to reach a shared goal. We focus on the setting in which a team of agents faces an opponent in a zero-sum, imperfect-information game. Team members can coordinate
In many real-world problems, a team of agents needs to collaborate to maximize a common reward. Although existing works formulate this problem within a centralized-learning, decentralized-execution framework, which avoids the non-stationary proble
In recent years, the Win-Stay-Lose-Learn rule has attracted wide attention as an effective strategy-updating rule, and voluntary participation has been proposed by introducing a third strategy into the Prisoner's Dilemma game. Some research shows that combining Win-
Evolutionary game theory assumes that players replicate a high-scoring player's strategy through genetic inheritance. However, when learning occurs culturally, it is often difficult to recognize someone's strategy just by observing the behaviour. In t
The Prisoner's Dilemma game is the most commonly used model in spatial evolutionary game theory and is considered a paradigm for portraying competition among selfish individuals. In recent years, Win-Stay-Lose-Learn, a strategy-updating rule based on aspiration,