Direct reciprocity is a well-known mechanism that could explain how cooperation emerges and prevails in an evolving population. Numerous prior studies have examined the emergence of cooperation in multiplayer games. However, most of them rely on numerical or experimental methods rather than theoretical analysis. This lack of theoretical work on the evolution of cooperation is due to the high complexity of calculating payoffs. In this paper, we propose a new method, the state-clustering method, to calculate long-term payoffs in repeated games. Using this method, the computational complexity for an $n$-player repeated game is reduced from $O(2^n)$ to $O(n^2)$, which makes it possible to compute payoffs in large-scale repeated games. As an example of the method's effectiveness, we explore the evolution of cooperation in both infinitely and finitely repeated public goods games. In both cases, we find that when the synergy factor is sufficiently large, increasing the number of participants in a game is detrimental to the evolution of cooperation. Our work provides a theoretical approach to studying the evolution of cooperation in repeated multiplayer games.
Since Press and Dyson's ingenious discovery of zero-determinant (ZD) strategies in the repeated Prisoner's Dilemma game, several studies have confirmed the existence of ZD strategies in repeated multiplayer social dilemmas. However, few works study th
In repeated interactions between individuals, we do not expect that exactly the same situation will occur from one time to another. Contrary to what is common in models of repeated games in the literature, most real situations may differ a lot and th
Multiplayer games have long been used as testbeds in artificial intelligence research, aptly referred to as the Drosophila of artificial intelligence. Traditionally, researchers have focused on using well-known games to build strong agents. This prog
In a two-stage repeated classical Prisoner's Dilemma game, the knowledge that both players will defect in the second stage makes the players defect in the first stage as well. We find a quantum version of this repeated game where the players deci
The notion of policy regret in online learning is a well-defined performance measure for the common scenario of adaptive adversaries, which more traditional quantities such as external regret do not take into account. We revisit the notion of