ترغب بنشر مسار تعليمي؟ اضغط هنا

69 - Brian Swenson , Soummya Kar , 2015
The paper studies the highly prototypical Fictitious Play (FP) algorithm, as well as a broad class of learning processes based on best-response dynamics, that we refer to as FP-type algorithms. A well-known shortcoming of FP is that, while players ma y learn an equilibrium strategy in some abstract sense, there are no guarantees that the period-by-period strategies generated by the algorithm actually converge to equilibrium themselves. This issue is fundamentally related to the discontinuous nature of the best response correspondence and is inherited by many FP-type algorithms. Not only does it cause problems in the interpretation of such algorithms as a mechanism for economic and social learning, but it also greatly diminishes the practical value of these algorithms for use in distributed control. We refer to forms of learning in which players learn equilibria in some abstract sense only (to be defined more precisely in the paper) as weak learning, and we refer to forms of learning where players period-by-period strategies converge to equilibrium as strong learning. An approach is presented for modifying an FP-type algorithm that achieves weak learning in order to construct a variant that achieves strong learning. Theoretical convergence results are proved.
Empirical Centroid Fictitious Play (ECFP) is a generalization of the well-known Fictitious Play (FP) algorithm designed for implementation in large-scale games. In ECFP, the set of players is subdivided into equivalence classes with players in the sa me class possessing similar properties. Players choose a next-stage action by tracking and responding to aggregate statistics related to each equivalence class. This setup alleviates the difficult task of tracking and responding to the statistical behavior of every individual player, as is the case in traditional FP. Aside from ECFP, many useful modifications have been proposed to classical FP, e.g., rules allowing for network-based implementation, increased computational efficiency, and stronger forms of learning. Such modifications tend to be of great practical value; however, their effectiveness relies heavily on two fundamental properties of FP: robustness to alterations in the empirical distribution step size process, and robustness to best-response perturbations. The main contribution of the paper is to show that similar robustness properties also hold for the ECFP algorithm. This result serves as a first step in enabling practical modifications to ECFP, similar to those already developed for FP.
74 - Brian Swenson , Soummya Kar , 2013
The paper is concerned with distributed learning in large-scale games. The well-known fictitious play (FP) algorithm is addressed, which, despite theoretical convergence results, might be impractical to implement in large-scale settings due to intens e computation and communication requirements. An adaptation of the FP algorithm, designated as the empirical centroid fictitious play (ECFP), is presented. In ECFP players respond to the centroid of all players actions rather than track and respond to the individual actions of every player. Convergence of the ECFP algorithm in terms of average empirical frequency (a notion made precise in the paper) to a subset of the Nash equilibria is proven under the assumption that the game is a potential game with permutation invariant potential function. A more general formulation of ECFP is then given (which subsumes FP as a special case) and convergence results are given for the class of potential games. Furthermore, a distributed formulation of the ECFP algorithm is presented, in which, players endowed with a (possibly sparse) preassigned communication graph, engage in local, non-strategic information exchange to eventually agree on a common equilibrium. Convergence results are proven for the distributed ECFP algorithm.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا