Optimal control of robust team stochastic games


الملخص بالإنكليزية

In stochastic dynamic environments, team stochastic games have emerged as a versatile paradigm for studying sequential decision-making problems of fully cooperative multi-agent systems. However, the optimality of the derived policies is usually sensitive to the model parameters, which are typically unknown and required to be estimated from noisy data in practice. To mitigate the sensitivity of the optimal policy to these uncertain parameters, in this paper, we propose a model of robust team stochastic games, where players utilize a robust optimization approach to make decisions. This model extends team stochastic games to the scenario of incomplete information and meanwhile provides an alternative solution concept of robust team optimality. To seek such a solution, we develop a learning algorithm in the form of a Gauss-Seidel modified policy iteration and prove its convergence. This algorithm, compared with robust dynamic programming, not only possesses a faster convergence rate, but also allows for using approximation calculations to alleviate the curse of dimensionality. Moreover, some numerical simulations are presented to demonstrate the effectiveness of the algorithm by generalizing the game model of social dilemmas to sequential robust scenarios.

تحميل البحث