Two-player games have a long and fruitful history of applications across the social, biological, and physical sciences. Most applications assume synchronous decisions or moves, even when the games are iterated. But different strategies may emerge as preferred when the decisions or moves are sequential or the games are iterated. Zero-determinant strategies, introduced by Press and Dyson, are a new class of strategies developed for synchronous two-player games, most notably the iterated prisoner's dilemma. Here we apply the Press-Dyson analysis to sequential, or asynchronous, two-player games, focusing on the asynchronous prisoner's dilemma. As a first application of this analysis, tit-for-tat is shown to be an efficient defense against extortionate zero-determinant strategies. Nice strategies like tit-for-tat are also shown to lead to Pareto-optimal payoffs for both players in the repeated prisoner's dilemma.
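As a rough illustration of the background construction, the sketch below builds an extortionate zero-determinant strategy for the standard synchronous iterated prisoner's dilemma and checks the linear payoff relation it enforces. It does not reproduce the asynchronous analysis described above. The payoff values, the function names, and the power-iteration approach are all assumptions made for this example.

```python
# Conventional PD payoffs (an assumption; the abstract does not fix values).
R, S, T, P = 3.0, 0.0, 5.0, 1.0


def extortion_strategy(chi, phi):
    """Extortionate zero-determinant strategy for player X (synchronous game).

    Returns X's cooperation probabilities after outcomes (CC, CD, DC, DD),
    written from X's point of view (X's move first).
    """
    return (
        1.0 - phi * (chi - 1.0) * (R - P),
        1.0 - phi * ((P - S) + chi * (T - P)),
        phi * ((T - P) + chi * (P - S)),
        0.0,
    )


def long_run_payoffs(p, q, start=(1.0, 0.0, 0.0, 0.0), steps=5000):
    """Approximate long-run payoffs (s_X, s_Y) by iterating the 4-state chain.

    p and q are memory-one strategies, each from its own player's viewpoint.
    The result is the state distribution after `steps` rounds, which matches
    the stationary payoffs only when the chain converges.
    """
    # Re-index Y's strategy into X's state order: X's CD is Y's DC, and vice versa.
    qx = (q[0], q[2], q[1], q[3])
    v = list(start)
    for _ in range(steps):
        nxt = [0.0] * 4
        for i in range(4):
            a, b = p[i], qx[i]  # cooperation probabilities of X and Y
            nxt[0] += v[i] * a * b
            nxt[1] += v[i] * a * (1 - b)
            nxt[2] += v[i] * (1 - a) * b
            nxt[3] += v[i] * (1 - a) * (1 - b)
        v = nxt
    s_x = v[0] * R + v[1] * S + v[2] * T + v[3] * P
    s_y = v[0] * R + v[1] * T + v[2] * S + v[3] * P
    return s_x, s_y


# Tit-for-tat from its own viewpoint: cooperate iff the opponent cooperated.
TIT_FOR_TAT = (1.0, 0.0, 1.0, 0.0)
```

Against a generic stochastic opponent, the extortioner's surplus over $P$ is $\chi$ times the opponent's surplus. Against `TIT_FOR_TAT`, starting from mutual cooperation, this synchronous chain is absorbed into mutual defection, so under these assumptions both players end up with $P$ and the extortioner gains nothing.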
We study a spatial, one-shot prisoner's dilemma (PD) model in which selection operates on both an organism's behavioral strategy (cooperate or defect) and its choice of when to implement that strategy across a set of discrete time slots. Cooperators regularly evolve to fixation in the model when we add time slots to lattices and small-world networks, and their share of the population grows, albeit slowly, when organisms interact in a scale-free network. This selection for cooperators occurs across a wide variety of time slots, and it does so even when a crucial condition for the evolution of cooperation on graphs is violated, namely when the ratio of benefits to costs in the PD does not exceed the number of spatially adjacent organisms.
We study the evolution of cooperation in populations where individuals play the prisoner's dilemma on a network. Every node of the network corresponds to an individual choosing whether to cooperate or defect in a repeated game. The players revise their actions by imitating those neighbors who have higher payoffs. We show that when the interactions take place on graphs with large girth, cooperation is more likely to emerge. Conversely, in graphs with many cycles of length 3 and 4, defection spreads more rapidly. One of the key ideas of our analysis is that our dynamics can be seen as a perturbation of the voter model. We write the transition kernel of the corresponding Markov chain in terms of the pairwise correlations in the voter model. We analyze the pairwise correlations and show that in graphs with relatively large girth, cooperators cluster and help each other.
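The imitation dynamics described above can be sketched roughly as follows. This is a minimal toy version under stated assumptions, not the model analyzed in the abstract: the donation-game payoffs `B` and `C`, the rule of copying a uniformly chosen better-off neighbor, and the helper `cycle_graph` are all hypothetical choices made for illustration.

```python
import random

# Donation-game form of the prisoner's dilemma (an assumed parameterisation):
# a cooperator pays cost C to give each neighbour a benefit B.
B, C = 3.0, 1.0


def payoffs(adj, coop):
    """Total payoff of every node from one round with all of its neighbours."""
    pay = {}
    for v, nbrs in adj.items():
        received = B * sum(coop[u] for u in nbrs)
        paid = C * len(nbrs) if coop[v] else 0.0
        pay[v] = received - paid
    return pay


def imitation_step(adj, coop, rng):
    """One update: a random node copies a randomly chosen better-off neighbour.

    If no neighbour earns strictly more, the node keeps its current action.
    """
    pay = payoffs(adj, coop)
    v = rng.choice(sorted(adj))
    better = [u for u in adj[v] if pay[u] > pay[v]]
    new = dict(coop)
    if better:
        new[v] = coop[rng.choice(better)]
    return new


def cycle_graph(n):
    """Adjacency lists of an n-cycle, a simple graph whose girth is n."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
```

Running `imitation_step` repeatedly on `cycle_graph(n)` and on a graph rich in triangles would be one way to explore the girth effect informally; the sketch makes no claim about which outcome appears.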
The conventional wisdom is that scale-free networks are prone to cooperation spreading. In this paper we investigate cooperative behavior on the structured scale-free network. Contrary to this conventional wisdom, the evolution of cooperation is inhibited on the structured scale-free network when the prisoner's dilemma (PD) game is played. First, we demonstrate that neither the scale-free property nor the high clustering coefficient is responsible for the inhibition of cooperation spreading on the structured scale-free network. Then we provide a heuristic argument that the lack of age correlations and the associated "large-world" behavior in the structured scale-free network inhibit the spread of cooperation. The findings may help guide further studies on the evolutionary dynamics of the PD game in scale-free networks.
The paradox of cooperation among selfish individuals still puzzles scientific communities. Although a large amount of evidence has demonstrated that cooperator clusters in spatial games are effective at protecting cooperators against the invasion of defectors, we still lack the condition for the formation of a giant cooperator cluster that assures the prevalence of cooperation in a system. Here, we study the dynamical organization of cooperator clusters in the spatial prisoner's dilemma game to offer the condition for the dominance of cooperation, finding that a phase transition characterized by the emergence of a large spanning cooperator cluster occurs when the initial fraction of cooperators exceeds a certain threshold. Interestingly, the phase transition belongs to different universality classes of percolation determined by the temptation to defect $b$. Specifically, on square lattices, $1<b<4/3$ leads to a phase transition pertaining to the class of regular site percolation, whereas $3/2<b<2$ gives rise to a phase transition subject to invasion percolation with trapping. Our findings offer a deeper understanding of cooperative behaviors in nature and society.
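A minimal sketch of the two ingredients mentioned above, a spatial prisoner's dilemma update on a square lattice and a measurement of the largest cooperator cluster, might look as follows. The weak-PD payoffs ($R=1$, $T=b$, $S=P=0$), the four-neighbor periodic lattice, the imitate-the-best update rule, and all function names are assumptions made for this illustration; the abstract does not specify them.

```python
from collections import deque


def neighbours(i, j, n):
    """Four nearest neighbours on an n x n lattice with periodic boundaries."""
    return [((i - 1) % n, j), ((i + 1) % n, j), (i, (j - 1) % n), (i, (j + 1) % n)]


def payoff(grid, i, j, b):
    """Weak PD payoffs: R = 1, T = b, S = P = 0 (an assumed parameterisation)."""
    coop_nbrs = sum(grid[x][y] for x, y in neighbours(i, j, len(grid)))
    return coop_nbrs * (1.0 if grid[i][j] == 1 else b)


def step(grid, b):
    """Synchronous update: each site copies its best-scoring neighbour,
    keeping its own strategy unless a neighbour scores strictly more."""
    n = len(grid)
    pay = [[payoff(grid, i, j, b) for j in range(n)] for i in range(n)]
    new = [row[:] for row in grid]
    for i in range(n):
        for j in range(n):
            best, best_pay = grid[i][j], pay[i][j]
            for x, y in neighbours(i, j, n):
                if pay[x][y] > best_pay:
                    best, best_pay = grid[x][y], pay[x][y]
            new[i][j] = best
    return new


def largest_cooperator_cluster(grid):
    """Size of the largest connected cluster of cooperators (cells equal to 1)."""
    n = len(grid)
    seen = set()
    largest = 0
    for i in range(n):
        for j in range(n):
            if grid[i][j] != 1 or (i, j) in seen:
                continue
            # Breadth-first search over this cluster.
            size = 0
            queue = deque([(i, j)])
            seen.add((i, j))
            while queue:
                cx, cy = queue.popleft()
                size += 1
                for nb in neighbours(cx, cy, n):
                    if grid[nb[0]][nb[1]] == 1 and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
            largest = max(largest, size)
    return largest
```

Tracking `largest_cooperator_cluster` over repeated calls to `step`, for different initial cooperator fractions and values of `b`, is one informal way to look for the emergence of a spanning cluster. The sketch itself does not establish any universality class.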
The Iterated Prisoner's Dilemma has guided research on social dilemmas for decades. However, it distinguishes between only two atomic actions: cooperate and defect. In real-world prisoner's dilemmas, these choices are temporally extended, and different strategies may correspond to sequences of actions, reflecting grades of cooperation. We introduce a Sequential Prisoner's Dilemma (SPD) game to better capture these characteristics. In this work, we propose a deep multiagent reinforcement learning approach that investigates the evolution of mutual cooperation in SPD games. Our approach consists of two phases. The first phase is offline: it synthesizes policies with different cooperation degrees and then trains a cooperation degree detection network. The second phase is online: an agent adaptively selects its policy based on the detected degree of opponent cooperation. The effectiveness of our approach is demonstrated in two representative 2D SPD games: the Apple-Pear game and the Fruit Gathering game. Experimental results show that our strategy can avoid being exploited by exploitative opponents and achieve cooperation with cooperative opponents.