ترغب بنشر مسار تعليمي؟ اضغط هنا

Computing Optimal Coarse Correlated Equilibria in Sequential Games

140   0   0.0 ( 0 )
 نشر من قبل Andrea Celli
 تاريخ النشر 2019
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

We investigate the computation of equilibria in extensive-form games where ex ante correlation is possible, focusing on correlated equilibria requiring the least amount of communication between the players and the mediator. Motivated by the hardness results on the computation of normal-form correlated equilibria, we introduce the notion of normal-form coarse correlated equilibrium, extending the definition of coarse correlated equilibrium to sequential games. We show that, in two-player games without chance moves, an optimal (e.g., social welfare maximizing) normal-form coarse correlated equilibrium can be computed in polynomial time, and that in general multi-player games (including two-player games with Chance), the problem is NP-hard. For the former case, we provide a polynomial-time algorithm based on the ellipsoid method and also propose a more practical one, which can be efficiently applied to problems of considerable size. Then, we discuss how our algorithm can be extended to games with Chance and games with more than two players.



قيم البحث

اقرأ أيضاً

In this paper, we consider the problem of wireless power control in an interference channel where transmitters aim to maximize their own benefit. When the individual payoff or utility function is derived from the transmission efficiency and the spent power, previous works typically study the Nash equilibrium of the resulting power control game. We propose to introduce concepts of correlated and communication equilibria from game theory to find efficient solutions (compared to the Nash equilibrium) for this problem. Communication and correlated equilibria are analyzed for the power control game, and we provide algorithms that can achieve these equilibria. Simulation results demonstrate that the correlation is beneficial under some settings, and the players achieve better payoffs.
We study correlated equilibria and coarse equilibria of simple first-price single-item auctions in the simplest auction model of full information. Nash equilibria are known to always yield full efficiency and a revenue that is at least the second-hig hest value. We prove that the same is true for all correlated equilibria, even those in which agents overbid -- i.e., bid above their values. Coarse equilibria, in contrast, may yield lower efficiency and revenue. We show that the revenue can be as low as 26% of the second-highest value in a coarse equilibrium, even if agents are assumed not to overbid, and this is tight. We also show that when players do not overbid, the worst-case bound on social welfare at coarse equilibrium improves from 63% of the highest value to 81%, and this bound is tight as well.
We study the application of iterative first-order methods to the problem of computing equilibria of large-scale two-player extensive-form games. First-order methods must typically be instantiated with a regularizer that serves as a distance-generatin g function for the decision sets of the players. For the case of two-player zero-sum games, the state-of-the-art theoretical convergence rate for Nash equilibrium is achieved by using the dilated entropy function. In this paper, we introduce a new entropy-based distance-generating function for two-player zero-sum games, and show that this function achieves significantly better strong convexity properties than the dilated entropy, while maintaining the same easily-implemented closed-form proximal mapping. Extensive numerical simulations show that these superior theoretical properties translate into better numerical performance as well. We then generalize our new entropy distance function, as well as general dilated distance functions, to the scaled extension operator. The scaled extension operator is a way to recursively construct convex sets, which generalizes the decision polytope of extensive-form games, as well as the convex polytopes corresponding to correlated and team equilibria. By instantiating first-order methods with our regularizers, we develop the first accelerated first-order methods for computing correlated equilibra and ex-ante coordinated team equilibria. Our methods have a guaranteed $1/T$ rate of convergence, along with linear-time proximal updates.
In this paper, we present exploitability descent, a new algorithm to compute approximate equilibria in two-player zero-sum extensive-form games with imperfect information, by direct policy optimization against worst-case opponents. We prove that when following this optimization, the exploitability of a players strategy converges asymptotically to zero, and hence when both players employ this optimization, the joint policies converge to a Nash equilibrium. Unlike fictitious play (XFP) and counterfactual regret minimization (CFR), our convergence result pertains to the policies being optimized rather than the average policies. Our experiments demonstrate convergence rates comparable to XFP and CFR in four benchmark games in the tabular case. Using function approximation, we find that our algorithm outperforms the tabular version in two of the games, which, to the best of our knowledge, is the first such result in imperfect information games among this class of algorithms.
Despite the many recent practical and theoretical breakthroughs in computational game theory, equilibrium finding in extensive-form team games remains a significant challenge. While NP-hard in the worst case, there are provably efficient algorithms f or certain families of team game. In particular, if the game has common external information, also known as A-loss recall -- informally, actions played by non-team members (i.e., the opposing team or nature) are either unknown to the entire team, or common knowledge within the team -- then polynomial-time algorithms exist (Kaneko and Kline, 1995). In this paper, we devise a completely new algorithm for solving team games. It uses a tree decomposition of the constraint system representing each teams strategy to reduce the number and degree of constraints required for correctness (tightness of the mathematical program). Our algorithm reduces the problem of solving team games to a linear program with at most $NW^{w+O(1)}$ nonzero entries in the constraint matrix, where $N$ is the size of the game tree, $w$ is a parameter that depends on the amount of uncommon external information, and $W$ is the treewidth of the tree decomposition. In public-action games, our program size is bounded by the tighter $tilde O(3^t 2^{t(n-1)}NW)$ for teams of $n$ players with $t$ types each. Since our algorithm describes the polytope of correlated strategies directly, we get equilibrium finding in correlated strategies for free -- instead of, say, having to run a double oracle algorithm. We show via experiments on a standard suite of games that our algorithm achieves state-of-the-art performance on all benchmark game classes except one. We also present, to our knowledge, the first experiments for this setting where more than one team has more than one member.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا