
Just as Markov decision processes underpin reinforcement learning, Stochastic Games (SGs) lay the foundation for the study of multi-agent reinforcement learning (MARL) and sequential agent interactions. In this paper, we show that computing an approximate Markov Perfect Equilibrium (MPE) in a finite-state discounted Stochastic Game to within exponential precision is PPAD-complete. We adopt a function with a polynomially bounded description in the strategy space to convert the MPE computation into a fixed-point problem, even though the stochastic game may require, for each agent, a number of pure strategies that is exponential in the number of states. The completeness result follows from reducing the fixed-point problem to End of the Line. Our results indicate that finding an MPE in SGs is highly unlikely to be NP-hard, since that would imply NP = co-NP. Our work offers confidence for MARL research to study MPE computation on general-sum SGs and to develop fruitful algorithms, as is currently done for zero-sum SGs.
We study competition among contests in a general model that allows for an arbitrary and heterogeneous space of contest design, where the goal of the contest designers is to maximize the contestants' sum of efforts. Our main result shows that optimal contests in the monopolistic setting (i.e., those that maximize the sum of efforts in a model with a single contest) form an equilibrium in the model with competition among contests. Under a very natural assumption, these contests are in fact dominant, and the equilibria that they form are unique. Moreover, equilibria with the optimal contests are Pareto-optimal even in cases where other equilibria emerge. In many natural cases, they also maximize the social welfare.
Xiaotie Deng, Qi Qi, Amin Saberi (2009)
We study the envy-free cake-cutting problem for $d+1$ players with $d$ cuts, in both the oracle function model and the polynomial-time function model. For the former, we derive a matching $\Theta((1/\epsilon)^{d-1})$ bound on the query complexity of $(d+1)$-player cake cutting with Lipschitz utilities for any $d > 1$. When the utility functions are given by a polynomial-time algorithm, we prove the problem to be PPAD-complete. For measurable utility functions, we give a fully polynomial-time algorithm for finding an approximate envy-free allocation of a cake among three people using two cuts.
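To make the notion of approximate envy-freeness concrete, the following is a minimal sketch, not the algorithm from the abstract above. It assumes the cake is the interval [0, 1], that each player's utility is supplied as a hypothetical function `value(a, b)` returning that player's value for the piece [a, b], and that `assignment[i]` is the index of the piece given to player i. The names `pieces_from_cuts`, `max_envy`, and `is_envy_free` are illustrative only.

```python
def pieces_from_cuts(cuts):
    """Turn d cut positions in (0, 1) into d + 1 contiguous pieces of [0, 1]."""
    points = [0.0] + sorted(cuts) + [1.0]
    return list(zip(points[:-1], points[1:]))


def max_envy(valuations, cuts, assignment):
    """Largest amount by which any player prefers another piece to their own.

    valuations: one callable value(a, b) per player (an assumption of this sketch).
    assignment: assignment[i] is the index of the piece held by player i.
    """
    pieces = pieces_from_cuts(cuts)
    worst = 0.0
    for player, value in enumerate(valuations):
        own = value(*pieces[assignment[player]])
        for piece in pieces:
            worst = max(worst, value(*piece) - own)
    return worst


def is_envy_free(valuations, cuts, assignment, eps=0.0):
    """True if no player envies another piece by more than eps."""
    return max_envy(valuations, cuts, assignment) <= eps


# Three players who all value a piece by its length, with two cuts.
uniform = lambda a, b: b - a
players = [uniform, uniform, uniform]
print(is_envy_free(players, [1 / 3, 2 / 3], [0, 1, 2], eps=1e-9))
print(max_envy(players, [0.5, 0.75], [0, 1, 2]))
```

The check itself is cheap; the hardness results in the abstract concern finding cut positions that pass it, not verifying them.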
We show that the BIMATRIX game does not have a fully polynomial-time approximation scheme, unless PPAD is in P. In other words, no algorithm with time polynomial in n and 1/epsilon can compute an epsilon-approximate Nash equilibrium of an n by n bimatrix game, unless PPAD is in P. Instrumental to our proof, we introduce a new discrete fixed-point problem on a high-dimensional cube with a constant side-length, such as an n-dimensional cube with side-length 7, and show that it is PPAD-complete. Furthermore, we prove, unless PPAD is in RP, that the smoothed complexity of the Lemke-Howson algorithm or any algorithm for computing a Nash equilibrium of a bimatrix game is not polynomial in n and 1/sigma under perturbations with magnitude sigma. Our result answers a major open question in the smoothed analysis of algorithms and the approximation of Nash equilibria.
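As an illustration of what an epsilon-approximate Nash equilibrium of a bimatrix game means, the sketch below checks a candidate pair of mixed strategies. It assumes the common additive definition (no player can gain more than epsilon by deviating), which may differ in normalization from the definition used in the paper. The function names `expected_payoff`, `nash_regret`, and `is_epsilon_nash` are hypothetical.

```python
def expected_payoff(M, x, y):
    """Expected payoff x^T M y for mixed strategies x (rows) and y (columns)."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def nash_regret(A, B, x, y):
    """Largest gain either player can obtain by switching to a pure strategy.

    A and B are the row and column players' payoff matrices (lists of rows).
    """
    row_value = expected_payoff(A, x, y)
    col_value = expected_payoff(B, x, y)
    # Best pure-strategy response of each player against the other's mix.
    best_row = max(sum(A[i][j] * y[j] for j in range(len(y)))
                   for i in range(len(x)))
    best_col = max(sum(x[i] * B[i][j] for i in range(len(x)))
                   for j in range(len(y)))
    return max(best_row - row_value, best_col - col_value)


def is_epsilon_nash(A, B, x, y, eps):
    """True if (x, y) is an additive eps-approximate Nash equilibrium."""
    return nash_regret(A, B, x, y) <= eps


# Matching pennies: the uniform mix is an exact equilibrium.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
print(is_epsilon_nash(A, B, [0.5, 0.5], [0.5, 0.5], eps=1e-9))
print(nash_regret(A, B, [1.0, 0.0], [1.0, 0.0]))
```

Verifying a candidate takes time polynomial in n; the result stated in the abstract is about the cost of finding one.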