Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications

Added by Sarah Perrin

Publication date 2020

fields Informatics Engineering

Research language: English

Authors: Sarah Perrin, Julien Perolat, Mathieu Laurière

Optimization and Control, Artificial Intelligence

In this paper, we deepen the analysis of the continuous time Fictitious Play learning algorithm by considering various finite state Mean Field Game settings (finite horizon, $\gamma$-discounted), allowing in particular for the introduction of an additional common noise. We first present a theoretical convergence analysis of the continuous time Fictitious Play process and prove that the induced exploitability decreases at a rate $O(\frac{1}{t})$. This analysis emphasizes the use of exploitability as a relevant metric for evaluating convergence towards a Nash equilibrium in the context of Mean Field Games. These theoretical contributions are supported by numerical experiments in both model-based and model-free settings. We thereby provide, for the first time, converging learning dynamics for Mean Field Games in the presence of common noise.
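To make the notion of exploitability concrete, the following is a minimal sketch of discrete-time fictitious play on a two-player zero-sum matrix game. It is an illustration only: it is not the paper's continuous time process, it does not model a Mean Field Game or common noise, and the $O(\frac{1}{t})$ rate stated above is not claimed for it. The function names and the rock-paper-scissors payoff matrix are assumptions introduced for this example.

```python
def best_response(payoffs):
    """Index of the largest payoff (ties broken by the lowest index)."""
    return max(range(len(payoffs)), key=lambda i: payoffs[i])


def action_payoffs(A, x, y):
    """Per-action payoffs for the row player (vs. y) and the column player (vs. x)."""
    n, m = len(A), len(A[0])
    row = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
    col = [-sum(x[i] * A[i][j] for i in range(n)) for j in range(m)]
    return row, col


def exploitability(A, x, y):
    """Total gain both players could obtain by deviating to a best response.

    In a zero-sum game the current payoffs cancel, so this is simply the
    sum of the two best-response values. It is zero at a Nash equilibrium.
    """
    row, col = action_payoffs(A, x, y)
    return max(row) + max(col)


def fictitious_play(A, steps):
    """Run fictitious play and return the exploitability of the averaged strategies."""
    n, m = len(A), len(A[0])
    x_counts = [0] * n
    y_counts = [0] * m
    x_counts[0] = 1  # arbitrary initial action (assumption)
    y_counts[0] = 1
    history = []
    for t in range(1, steps + 1):
        # Empirical (time-averaged) strategies after t plays.
        x = [c / t for c in x_counts]
        y = [c / t for c in y_counts]
        history.append(exploitability(A, x, y))
        # Each player best-responds to the opponent's empirical strategy.
        row, col = action_payoffs(A, x, y)
        x_counts[best_response(row)] += 1
        y_counts[best_response(col)] += 1
    return history


# Rock-paper-scissors payoffs for the row player (illustrative choice).
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

history = fictitious_play(RPS, 2000)
print(history[0], history[-1])
```

The exploitability of the averaged strategies shrinks as play continues, which is the sense in which it serves as a convergence metric.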

Convergence of Deep Fictitious Play for Stochastic Differential Games

Jiequn Han, Ruimeng Hu, Jihao Long, 2020

Stochastic differential games have been used extensively to model competition among agents in finance, for instance in P2P lending platforms from the Fintech industry, the banking system for systemic risk, and insurance markets. The recently proposed machine learning algorithm, deep fictitious play, provides a novel and efficient tool for finding the Markovian Nash equilibrium of large $N$-player asymmetric stochastic differential games [J. Han and R. Hu, Mathematical and Scientific Machine Learning Conference, pages 221-245, PMLR, 2020]. By incorporating the idea of fictitious play, the algorithm decouples the game into $N$ sub-optimization problems, and identifies each player's optimal strategy with the deep backward stochastic differential equation (BSDE) method, in parallel and repeatedly. In this paper, we prove the convergence of deep fictitious play (DFP) to the true Nash equilibrium. We also show that the strategy based on DFP forms an $\epsilon$-Nash equilibrium. We generalize the algorithm by proposing a new approach to decouple the games, and present numerical results for large population games showing the empirical convergence of the algorithm beyond the technical assumptions in the theorems.

Optimization and Control, Computer Science and Game Theory, Machine Learning

A Computationally Efficient Implementation of Fictitious Play for Large-Scale Games

B. Swenson, S. Kar, 2015

The paper is concerned with distributed learning and optimization in large-scale settings. The well-known Fictitious Play (FP) algorithm has been shown to achieve Nash equilibrium learning in certain classes of multi-agent games. However, FP can be computationally difficult to implement when the number of players is large. Sampled FP is a variant of FP that mitigates the computational difficulties arising in FP by using a Monte-Carlo (i.e., sampling-based) approach. The Sampled FP algorithm has been studied both as a tool for distributed learning and as an optimization heuristic for large-scale problems. Despite its computational advantages, a shortcoming of Sampled FP is that the number of samples that must be drawn in each round of the algorithm grows without bound (on the order of $\sqrt{t}$, where $t$ is the round of the repeated play). In this paper we propose Computationally Efficient Sampled FP (CESFP), a variant of Sampled FP in which only one sample need be drawn in each round of the algorithm (a substantial reduction from the $O(\sqrt{t})$ samples per round required in Sampled FP). CESFP uses a stochastic-approximation type rule to estimate the expected utility from round to round. It is proven that the CESFP algorithm achieves Nash equilibrium learning in the same sense as classical FP and Sampled FP. Simulation results suggest that the convergence rate of CESFP (in terms of repeated-play iterations) is similar to that of Sampled FP.
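The idea of a stochastic-approximation rule that needs only one sample per round can be sketched as follows. This is a generic Robbins-Monro style running average, not the exact CESFP update from the paper; the function name, the $1/t$ step size, the utility table, and the fixed opponent strategy are all assumptions made for illustration (in actual fictitious play the opponents' empirical strategy changes from round to round).

```python
import random


def estimate_utilities(utility, opponent_strategy, rounds, seed=0):
    """Estimate each action's expected utility from one opponent sample per round.

    utility[a][b] is the (hypothetical) payoff of our action a against
    opponent action b; opponent_strategy is a probability vector over b.
    """
    rng = random.Random(seed)
    opponent_actions = list(range(len(opponent_strategy)))
    estimates = [0.0] * len(utility)
    for t in range(1, rounds + 1):
        # Draw a single opponent action this round.
        b = rng.choices(opponent_actions, weights=opponent_strategy)[0]
        step = 1.0 / t  # decreasing step size (assumption)
        for a in range(len(utility)):
            # Move the running estimate a small step toward the new sample.
            estimates[a] += step * (utility[a][b] - estimates[a])
    return estimates


# Illustrative payoffs: action a pays 1 when it matches the opponent's action.
utility = [[1.0, 0.0],
           [0.0, 1.0]]
estimates = estimate_utilities(utility, [0.7, 0.3], rounds=20000)
print(estimates)
```

With the $1/t$ step size the estimate equals the sample mean of the utilities observed so far, so it approaches the true expected utilities (0.7 and 0.3 in this example) without ever drawing more than one sample in a round.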

Optimization and Control, Computer Science and Game Theory, Probability

(Local) Non-Asymptotic Analysis of Logistic Fictitious Play for Two-Player Zero-Sum Games and Its Deterministic Variant

Renbo Zhao, Qiuyun Zhu, 2021

We conduct a local non-asymptotic analysis of the logistic fictitious play (LFP) algorithm, and show that, with high probability, this algorithm converges locally at rate $O(1/t)$. To achieve this, we first develop a global non-asymptotic analysis of the deterministic variant of LFP, which we call DLFP, and derive a class of convergence rates based on different step-sizes. We then incorporate a particular form of stochastic noise into the analysis of DLFP, and obtain the local convergence rate of LFP. As a result of independent interest, we extend DLFP to solve a class of strongly convex composite optimization problems. We show that although the resulting algorithm is a simple variant of the generalized Frank-Wolfe method in Nesterov [1, Section 5], somewhat surprisingly, it enjoys a significantly improved convergence rate.
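In logistic fictitious play, the exact best response is replaced by a smoothed, logit-style response to the expected payoffs. A minimal sketch of such a response is given below; the function name and the `temperature` parameter are assumptions for illustration and are not taken from the paper.

```python
import math


def logistic_best_response(payoffs, temperature=0.1):
    """Softmax over expected payoffs: a smoothed version of the argmax.

    A small temperature concentrates mass on the best action; a large
    temperature approaches the uniform distribution.
    """
    scaled = [p / temperature for p in payoffs]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


probs = logistic_best_response([1.0, 0.0], temperature=1.0)
print(probs)
```

Because every action keeps positive probability, the response varies smoothly with the payoffs, unlike the discontinuous exact best response used in classical fictitious play.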

Optimization and Control, Probability

Empirical Centroid Fictitious Play: An Approach For Distributed Learning In Multi-Agent Games

Brian Swenson, Soummya Kar, 2013

The paper is concerned with distributed learning in large-scale games. The well-known fictitious play (FP) algorithm is addressed, which, despite theoretical convergence results, might be impractical to implement in large-scale settings due to intense computation and communication requirements. An adaptation of the FP algorithm, designated as the empirical centroid fictitious play (ECFP), is presented. In ECFP, players respond to the centroid of all players' actions rather than track and respond to the individual actions of every player. Convergence of the ECFP algorithm in terms of average empirical frequency (a notion made precise in the paper) to a subset of the Nash equilibria is proven under the assumption that the game is a potential game with a permutation invariant potential function. A more general formulation of ECFP is then given (which subsumes FP as a special case) and convergence results are given for the class of potential games. Furthermore, a distributed formulation of the ECFP algorithm is presented, in which players endowed with a (possibly sparse) preassigned communication graph engage in local, non-strategic information exchange to eventually agree on a common equilibrium. Convergence results are proven for the distributed ECFP algorithm.
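The centroid idea can be sketched in a few lines: each player needs only the average of all players' empirical action frequencies, not every individual history. The helper names and the congestion-style utility below are hypothetical and serve only to illustrate the mechanism; they are not the paper's formulation.

```python
def centroid(empirical_distributions):
    """Average the players' empirical action frequencies, action by action."""
    n_players = len(empirical_distributions)
    n_actions = len(empirical_distributions[0])
    return [
        sum(dist[a] for dist in empirical_distributions) / n_players
        for a in range(n_actions)
    ]


def best_response_to_centroid(utility, center):
    """Pick the action with the highest utility against the centroid."""
    return max(range(len(center)), key=lambda a: utility(a, center))


def congestion_utility(action, center):
    """Hypothetical utility: an action is worth less the more crowded it is."""
    return -center[action]


# Two players: one always played action 0, the other mixed evenly.
frequencies = [[1.0, 0.0], [0.5, 0.5]]
center = centroid(frequencies)
print(center, best_response_to_centroid(congestion_utility, center))
```

Tracking a single centroid vector keeps the per-player state independent of the number of players, which is the source of the computational and communication savings described above.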

Optimization and Control, Computer Science and Game Theory, Systems and Control

Reinforcement Learning for Mean Field Games, with Applications to Economics

Andrea Angiuli, Jean-Pierre Fouque, Mathieu Lauriere, 2021

Mean field games (MFG) and mean field control problems (MFC) are frameworks to study Nash equilibria or social optima in games with a continuum of agents. These problems can be used to approximate competitive or cooperative games with a large finite number of agents and have found a broad range of applications, in particular in economics. In recent years, the question of learning in MFG and MFC has garnered interest, both as a way to compute solutions and as a way to model how large populations of learners converge to an equilibrium. Of particular interest is the setting where the agents do not know the model, which leads to the development of reinforcement learning (RL) methods. After reviewing the literature on this topic, we present a two timescale approach with RL for MFG and MFC, which relies on a unified Q-learning algorithm. The main novelty of this method is to simultaneously update an action-value function and a distribution, but at different rates, in a model-free fashion. Depending on the ratio of the two learning rates, the algorithm learns either the MFG or the MFC solution. To illustrate this method, we apply it to a mean field problem of accumulated consumption in finite horizon with a HARA utility function, and to a trader's optimal liquidation problem.

Optimization and Control, Machine Learning
