Deep Learning Methods for Mean Field Control Problems with Delay

Added by Jean-Pierre Fouque

Publication date 2019

Research language: English

Authors Jean-Pierre Fouque - Zhaoyu Zhang

Optimization and Control

We consider a general class of mean field control problems described by stochastic delayed differential equations of McKean-Vlasov type. Two numerical algorithms based on deep learning techniques are provided: one directly parameterizes the optimal control using neural networks; the other numerically solves the McKean-Vlasov forward anticipated backward stochastic differential equation (MV-FABSDE) system. In addition, we establish a necessary and sufficient stochastic maximum principle for this class of mean field control problems with delay, based on differential calculus on functions of measures, as well as existence and uniqueness results for the associated MV-FABSDE system.
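As a rough illustration of the first approach (a parameterized control evaluated by simulation), the sketch below estimates the cost of a control for a toy delayed McKean-Vlasov dynamic using an interacting particle system and an Euler-Maruyama scheme. Everything specific here is an assumption made for illustration and is not taken from the paper: the drift, the quadratic cost, the constant initial path, the function name `simulate_cost`, and the two-parameter linear feedback `theta`, which stands in for the neural network.

```python
import random

def simulate_cost(theta, n_particles=200, n_steps=50, T=1.0,
                  delay_steps=5, sigma=0.3, seed=0):
    """Monte Carlo cost of a linear feedback control (hypothetical toy model).

    Toy dynamics: dX = ((mean - X) + alpha) dt + sigma dW, with
    alpha = -k_now * X_t - k_delay * X_{t - tau}.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    k_now, k_delay = theta
    # history[j] holds every particle's state at time step j.
    # The path before time 0 is assumed constant and equal to the initial state.
    history = [[rng.gauss(0.0, 1.0) for _ in range(n_particles)]]
    running = 0.0
    for j in range(n_steps):
        x = history[j]
        x_delayed = history[max(j - delay_steps, 0)]
        mean = sum(x) / n_particles  # empirical mean replaces the law of X_t
        new_x = []
        for i in range(n_particles):
            alpha = -k_now * x[i] - k_delay * x_delayed[i]
            drift = (mean - x[i]) + alpha
            running += 0.5 * alpha ** 2 * dt
            noise = sigma * dt ** 0.5 * rng.gauss(0.0, 1.0)
            new_x.append(x[i] + drift * dt + noise)
        history.append(new_x)
    terminal = sum(0.5 * xi ** 2 for xi in history[-1])
    return (running + terminal) / n_particles

cost = simulate_cost((0.5, 0.1))
print(cost)
```

In a deep learning version, `theta` would be the weights of a network taking the current and delayed states as input, and the simulated cost would be minimized by stochastic gradient descent; the two-parameter feedback above only keeps the example self-contained.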

Unified Reinforcement Q-Learning for Mean Field Game and Control Problems

Andrea Angiuli, Jean-Pierre Fouque, Mathieu Laurière 2020

We present a Reinforcement Learning (RL) algorithm to solve infinite horizon asymptotic Mean Field Game (MFG) and Mean Field Control (MFC) problems. Our approach can be described as a unified two-timescale Mean Field Q-learning: the same algorithm can learn either the MFG or the MFC solution by simply tuning the ratio of two learning parameters. The algorithm is in discrete time and space, where the agent provides the environment not only with an action but also with a distribution of the state, in order to take into account the mean field feature of the problem. Importantly, we assume that the agent cannot observe the population's distribution and needs to estimate it in a model-free manner. The asymptotic MFG and MFC problems are also presented in continuous time and space, and compared with classical (non-asymptotic or stationary) MFG and MFC problems. They lead to explicit solutions in the linear-quadratic (LQ) case that are used as benchmarks for the results of our algorithm.
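The two-timescale idea described above can be sketched as a tabular loop in which a Q-table and an estimate of the state distribution are updated at each step with separate learning rates. This is a minimal sketch under stated assumptions, not the authors' algorithm: the ring-shaped state space, the crowd-averse reward, the constant learning rates, and the names `two_timescale_q_learning`, `rho_q`, and `rho_mu` are all hypothetical.

```python
import random

def two_timescale_q_learning(rho_q, rho_mu, n_states=5, n_steps=5000,
                             gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with a jointly learned distribution estimate mu."""
    rng = random.Random(seed)
    actions = (-1, 0, 1)  # move left, stay, move right on a ring (toy model)
    q = [[0.0] * len(actions) for _ in range(n_states)]
    mu = [1.0 / n_states] * n_states  # estimated distribution of the state
    x = 0
    for _ in range(n_steps):
        # Epsilon-greedy action selection from the current Q-table.
        if rng.random() < eps:
            a = rng.randrange(len(actions))
        else:
            a = max(range(len(actions)), key=lambda i: q[x][i])
        x_next = (x + actions[a]) % n_states
        # Hypothetical reward: avoid crowded states, pay a small cost to move.
        reward = -mu[x_next] - 0.1 * abs(actions[a])
        # Q update with learning rate rho_q.
        target = reward + gamma * max(q[x_next])
        q[x][a] += rho_q * (target - q[x][a])
        # Distribution update with learning rate rho_mu (model-free estimate).
        mu = [m + rho_mu * ((1.0 if s == x_next else 0.0) - m)
              for s, m in enumerate(mu)]
        x = x_next
    return q, mu

q_table, mu_est = two_timescale_q_learning(rho_q=0.1, rho_mu=0.01)
print(mu_est)
```

The ratio `rho_mu / rho_q` is the knob the abstract refers to: the same loop is run in both regimes, and only the relative speed of the two updates changes.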

Optimization and Control Machine Learning Multiagent Systems

Proximal methods for stationary Mean Field Games with local couplings

L.M. Briceño-Arias, D. Kalise, F.J. Silva 2016

We address the numerical approximation of Mean Field Games with local couplings. For power-like Hamiltonians, we consider both unconstrained and constrained stationary systems with density constraints in order to model hard congestion effects. For finite difference discretizations of the Mean Field Game system, we follow a variational approach. We prove that the aforementioned schemes can be obtained as the optimality system of suitably defined optimization problems. In order to prove the existence of solutions of the scheme with a variational argument, the monotonicity of the coupling term is not used, which allows us to recover general existence results. Next, assuming that the coupling term is monotone, the variational problem is cast as a convex optimization problem for which we study and compare several proximal type methods. These algorithms have several interesting features, such as global convergence and stability with respect to the viscosity parameter, which can eventually be zero. We assess the performance of the methods via numerical experiments.

Optimization and Control Numerical Analysis

Ergodic behavior of control and mean field games problems depending on acceleration

Pierre Cardaliaguet, Cristian Mendico 2020

The goal of this paper is to study the long time behavior of solutions of the first-order mean field game (MFG) systems with a control on the acceleration. The main issue is the lack of small time controllability of the problem, which prevents defining the associated ergodic mean field game problem in the standard way. To overcome this issue, we first study the long-time average of optimal control problems with control on the acceleration: we prove that the time average of the value function converges to an ergodic constant and represent this ergodic constant as a minimum of a Lagrangian over a suitable class of closed probability measures. This characterization leads us to define the ergodic MFG problem as a fixed-point problem on the set of closed probability measures. We then show that this ergodic MFG problem has at least one solution, that the associated ergodic constant is unique under the standard monotonicity assumption, and that the time average of the value function of the time-dependent MFG problem with control of acceleration converges to this ergodic constant.

Optimization and Control

Mean field control hierarchy

Giacomo Albi, Young-Pil Choi, Massimo Fornasier 2016

In this paper we model the role of a government of a large population as a mean field optimal control problem. Such control problems are constrained by a PDE of continuity type, governing the dynamics of the probability distribution of the agent population. We show the existence of mean field optimal controls in both the stochastic and deterministic settings. We rigorously derive the first order optimality conditions useful for numerical computation of mean field optimal controls. We introduce a novel approximating hierarchy of sub-optimal controls based on a Boltzmann approach, whose computation requires very moderate numerical complexity compared with that of the optimal control. We provide numerical experiments for models in opinion formation comparing the behavior of the control hierarchy.

Optimization and Control Analysis of PDEs

Entropy Regularization for Mean Field Games with Learning

Xin Guo, Renyuan Xu, Thaleia Zariphopoulou 2020

Entropy regularization has been extensively adopted to improve the efficiency, the stability, and the convergence of algorithms in reinforcement learning. This paper analyzes both quantitatively and qualitatively the impact of entropy regularization for Mean Field Games (MFG) with learning in a finite time horizon. Our study provides a theoretical justification that entropy regularization yields time-dependent policies and, furthermore, helps stabilize and accelerate convergence to the game equilibrium. In addition, this study leads to a policy-gradient algorithm for exploration in MFG. Under this algorithm, agents are able to learn the optimal exploration scheduling, with stable and fast convergence to the game equilibrium.
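For intuition about what entropy regularization does to a policy, the sketch below uses the simplest discrete-action case, which is not the setting analyzed in the paper: maximizing expected value plus a weight `lam` times the Shannon entropy over a finite action set yields a softmax (Boltzmann) policy. The action values `q_values`, the weight `lam`, and the function names are assumptions chosen for illustration.

```python
import math

def softmax_policy(q_values, lam):
    """Maximizer of sum_a p(a) q(a) + lam * entropy(p) over distributions p."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / lam) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def regularized_objective(policy, q_values, lam):
    """Expected value plus lam times the Shannon entropy of the policy."""
    expected = sum(p * q for p, q in zip(policy, q_values))
    entropy = -sum(p * math.log(p) for p in policy if p > 0.0)
    return expected + lam * entropy

def optimal_value(q_values, lam):
    """Closed-form optimum: lam * log(sum_a exp(q(a) / lam))."""
    m = max(q_values)
    return m + lam * math.log(sum(math.exp((q - m) / lam) for q in q_values))

q_values = [1.0, 0.5, -0.2]  # hypothetical action values
lam = 0.5                    # hypothetical regularization weight
pi = softmax_policy(q_values, lam)
greedy = [1.0, 0.0, 0.0]
print(regularized_objective(pi, q_values, lam))
print(regularized_objective(greedy, q_values, lam))
```

The softmax policy keeps positive probability on every action, which is the exploration effect; as `lam` shrinks toward zero it concentrates on the greedy action.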

Optimization and Control Machine Learning