
Computing Nash Equilibria: Approximation and Smoothed Complexity

Posted by Shanghua Teng
Publication date: 2006
Research field: Informatics Engineering
Paper language: English





We show that the BIMATRIX game does not have a fully polynomial-time approximation scheme, unless PPAD is in P. In other words, no algorithm with time polynomial in n and 1/epsilon can compute an epsilon-approximate Nash equilibrium of an n-by-n bimatrix game, unless PPAD is in P. Instrumental to our proof, we introduce a new discrete fixed-point problem on a high-dimensional cube with a constant side-length, such as an n-dimensional cube with side-length 7, and show that it is PPAD-complete. Furthermore, we prove, unless PPAD is in RP, that the smoothed complexity of the Lemke-Howson algorithm, or of any algorithm for computing a Nash equilibrium of a bimatrix game, is not polynomial in n and 1/sigma under perturbations with magnitude sigma. Our result answers a major open question in the smoothed analysis of algorithms and the approximation of Nash equilibria.
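To make the approximation notion concrete, here is a minimal verification sketch in Python/NumPy (our illustration, not from the paper): a pair of mixed strategies (x, y) is an epsilon-approximate Nash equilibrium of the bimatrix game (A, B) when neither player can gain more than epsilon by switching to any pure strategy. The hardness result above says that no algorithm running in time polynomial in n and 1/epsilon can produce such a pair, unless PPAD is in P.

```python
import numpy as np

def is_eps_nash(A, B, x, y, eps):
    """Check whether mixed strategies (x, y) form an epsilon-approximate
    Nash equilibrium of the bimatrix game (A, B).

    A[i, j]: row player's payoff; B[i, j]: column player's payoff.
    x, y: probability vectors (mixed strategies) for the two players.
    The best deviation over mixed strategies is attained at a pure one,
    so checking pure deviations suffices.
    """
    row_payoff = x @ A @ y          # row player's expected payoff
    col_payoff = x @ B @ y          # column player's expected payoff
    best_row = (A @ y).max()        # best pure deviation for the row player
    best_col = (x @ B).max()        # best pure deviation for the column player
    return best_row - row_payoff <= eps and best_col - col_payoff <= eps

# Matching pennies: the uniform profile is an exact (hence eps-) equilibrium.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
B = -A
x = y = np.array([0.5, 0.5])
print(is_eps_nash(A, B, x, y, eps=0.01))  # True
```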




Read also

We prove that computing a Nash equilibrium of a two-player ($n \times n$) game with payoffs in $[-1,1]$ is PPAD-hard (under randomized reductions) even in the smoothed analysis setting, smoothing with noise of constant magnitude. This gives a strong negative answer to conjectures of Spielman and Teng [ST06] and Chen, Deng, and Teng [CDT09]. In contrast to prior work proving PPAD-hardness after smoothing by noise of magnitude $1/\operatorname{poly}(n)$ [CDT09], our smoothed complexity result is not proved via hardness of approximation for Nash equilibria. This is by necessity, since Nash equilibria can be approximated to constant error in quasi-polynomial time [LMM03]. Our results therefore separate smoothed complexity and hardness of approximation for Nash equilibria in two-player games. The key ingredient in our reduction is the use of a random zero-sum game as a gadget to produce two-player games which remain hard even after smoothing. Our analysis crucially shows that all Nash equilibria of random zero-sum games are far from pure (with high probability), and that this remains true even after smoothing.
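As a hedged illustration of the gadget's key property (the matrix size n, Gaussian payoffs, noise distribution, and LP solver below are our assumptions, not the paper's construction), one can solve a random zero-sum game by linear programming, smooth it with constant-magnitude noise, and check that the maximin strategy stays far from pure:

```python
import numpy as np
from scipy.optimize import linprog

def zero_sum_equilibrium(A):
    """Row player's maximin strategy for the zero-sum game with payoff A.

    Variables z = (x_1, ..., x_n, v).  Maximize v subject to
    (x^T A)_j >= v for every column j, with x a probability vector.
    """
    n = A.shape[0]
    c = np.zeros(n + 1)
    c[-1] = -1.0                               # maximize v == minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])  # v - (x^T A)_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1))])  # sum(x) == 1
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * n + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[:n]

rng = np.random.default_rng(0)
n, sigma = 50, 0.1                  # illustrative size and noise magnitude
A = rng.standard_normal((n, n))     # random zero-sum game
A_smoothed = A + rng.uniform(-sigma, sigma, size=(n, n))

for label, M in [("original", A), ("smoothed", A_smoothed)]:
    x = zero_sum_equilibrium(M)
    # With high probability the largest strategy weight is well below 1,
    # i.e. the equilibrium is far from pure, before and after smoothing.
    print(label, "max strategy weight:", x.max().round(3))
```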
The use of monotonicity and Tarski's theorem in existence proofs of equilibria is very widespread in economics, while Tarski's theorem is also often used for similar purposes in the context of verification. However, there has been relatively little in the way of analysis of the complexity of finding the fixed points and equilibria guaranteed by this result. We study a computational formalism based on monotone functions on the $d$-dimensional grid with sides of length $N$, and their fixed points, as well as the closely connected subject of supermodular games and their equilibria. It is known that finding some (any) fixed point of a monotone function can be done in time $\log^d N$, and we show it requires at least $\log^2 N$ function evaluations already on the 2-dimensional grid, even for randomized algorithms. We show that the general Tarski problem of finding some fixed point, when the monotone function is given succinctly (by a boolean circuit), is in the class PLS of problems solvable by local search and, rather surprisingly, also in the class PPAD. Finding the greatest or least fixed point guaranteed by Tarski's theorem, however, requires $d \cdot N$ steps, and is NP-hard in the white box model. For supermodular games, we show that finding an equilibrium in such games is essentially computationally equivalent to the Tarski problem, and finding the maximum or minimum equilibrium is similarly harder. Interestingly, two-player supermodular games where the strategy space of one player is one-dimensional can be solved in $O(\log N)$ steps. We also observe that computing (approximating) the value of Condon's (Shapley's) stochastic games reduces to the Tarski problem. An important open problem highlighted by this work is proving an $\Omega(\log^d N)$ lower bound for small fixed dimension $d \geq 3$.
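As a worked illustration of the $d \cdot N$-step bound for the least fixed point (the grid size and the example monotone function below are our assumptions, not from the paper), Kleene iteration from the bottom element of the grid climbs an increasing chain of height at most $d(N-1)$ and stops at the least fixed point:

```python
import numpy as np

def least_fixed_point(f, d, N):
    """Kleene iteration: start at the bottom element (0, ..., 0) and apply
    the monotone f until it stabilizes.  Since bottom <= f(bottom) and f is
    monotone, the iterates form an increasing chain of height at most
    d*(N-1), which gives the d*N-style step bound; the limit is the least
    fixed point of f on the grid {0, ..., N-1}^d.
    """
    v = np.zeros(d, dtype=int)
    steps = 0
    while True:
        w = f(v)
        steps += 1
        if np.array_equal(w, v):
            return v, steps
        v = w

# Example monotone function on {0,...,N-1}^d: move one step toward a target.
d, N = 3, 10
target = np.array([7, 2, 9])
f = lambda v: np.minimum(v + 1, target)  # monotone; least fixed point = target

print(least_fixed_point(f, d, N))  # (array([7, 2, 9]), 10)
```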
Model-free learning for multi-agent stochastic games is an active area of research. Existing reinforcement learning algorithms, however, are often restricted to zero-sum games, and are applicable only in small state-action spaces or other simplified settings. Here, we develop a new data-efficient Deep Q-learning methodology for model-free learning of Nash equilibria for general-sum stochastic games. The algorithm uses a local linear-quadratic expansion of the stochastic game, which leads to analytically solvable optimal actions. The expansion is parametrized by deep neural networks to give it sufficient flexibility to learn the environment without the need to experience all state-action pairs. We study symmetry properties of the algorithm stemming from label-invariant stochastic games and, as a proof of concept, apply our algorithm to learning optimal trading strategies in competitive electronic markets.
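To see why a local linear-quadratic expansion makes optimal actions analytically solvable, here is a minimal NumPy sketch (a toy parametrization of ours; in the paper the coefficients come from deep networks): if Q(s, a) is modeled as c + q^T a - 0.5 a^T P a with P positive definite, the greedy action is the linear solve a* = P^{-1} q rather than a search over the action space.

```python
import numpy as np

def optimal_action(q, P):
    """Argmax of the quadratic model Q(a) = c + q^T a - 0.5 * a^T P a.

    Setting the gradient q - P a to zero gives a* = P^{-1} q, so the
    greedy action is a linear solve instead of a grid search over actions.
    """
    return np.linalg.solve(P, q)

# Toy state-dependent coefficients (hypothetical; in the paper these are
# produced by deep networks evaluated at the current state).
q = np.array([1.0, -2.0])
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])      # symmetric positive definite

a_star = optimal_action(q, P)
grad = q - P @ a_star           # ~0 at the maximizer
print(a_star, grad)
```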
Given a binary dominance relation on a set of alternatives, a common thread in the social sciences is to identify subsets of alternatives that satisfy certain notions of stability. Examples can be found in areas as diverse as voting theory, game theory, and argumentation theory. Brandt and Fischer [BF08] proved that it is NP-hard to decide whether an alternative is contained in some inclusion-minimal upward or downward covering set. For both problems, we raise this lower bound to the $\Theta_2^p$ level of the polynomial hierarchy and provide a $\Sigma_2^p$ upper bound. Relatedly, we show that a variety of other natural problems regarding minimal or minimum-size covering sets are hard or complete for NP, coNP, or $\Theta_2^p$. An important consequence of our results is that neither minimal upward nor minimal downward covering sets (even when guaranteed to exist) can be computed in polynomial time unless P=NP. This sharply contrasts with Brandt and Fischer's result that minimal bidirectional covering sets (i.e., sets that are both minimal upward and minimal downward covering sets) are polynomial-time computable.
Generative adversarial networks (GANs) represent a zero-sum game between two machine players, a generator and a discriminator, designed to learn the distribution of data. While GANs have achieved state-of-the-art performance in several benchmark learning tasks, GAN minimax optimization still poses great theoretical and empirical challenges. GANs trained using first-order optimization methods commonly fail to converge to a stable solution where the players cannot improve their objective, i.e., the Nash equilibrium of the underlying game. Such issues raise the question of the existence of Nash equilibrium solutions in the GAN zero-sum game. In this work, we show through several theoretical and numerical results that indeed GAN zero-sum games may not have any local Nash equilibria. To characterize an equilibrium notion applicable to GANs, we consider the equilibrium of a new zero-sum game with an objective function given by a proximal operator applied to the original objective, a solution we call the proximal equilibrium. Unlike the Nash equilibrium, the proximal equilibrium captures the sequential nature of GANs, in which the generator moves first followed by the discriminator. We prove that the optimal generative model in Wasserstein GAN problems provides a proximal equilibrium. Inspired by these results, we propose a new approach, which we call proximal training, for solving GAN problems. We discuss several numerical experiments demonstrating the existence of proximal equilibrium solutions in GAN minimax problems.
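The following toy sketch (our illustration of generic proximal/implicit updates in a bilinear minimax problem, not the paper's proximal-training algorithm or its proximal-equilibrium definition) shows the phenomenon that motivates proximal methods: on f(g, d) = g*d, simultaneous gradient descent-ascent spirals away from the unique equilibrium at the origin, while the implicit proximal-point update contracts toward it.

```python
import numpy as np

def gda_step(g, d, eta):
    """Explicit simultaneous gradient descent-ascent on f(g, d) = g * d.
    The iterate norm grows by sqrt(1 + eta^2) every step: it diverges."""
    return g - eta * d, d + eta * g

def proximal_point_step(g, d, eta):
    """Implicit (proximal-point) update: g+ = g - eta*d+, d+ = d + eta*g+.
    Solving this 2x2 linear system in (g+, d+) gives a contraction of the
    iterate norm by 1/sqrt(1 + eta^2) every step: it converges."""
    g_next = (g - eta * d) / (1 + eta ** 2)
    d_next = (d + eta * g) / (1 + eta ** 2)
    return g_next, d_next

g1 = d1 = g2 = d2 = 1.0
for _ in range(50):
    g1, d1 = gda_step(g1, d1, eta=0.1)
    g2, d2 = proximal_point_step(g2, d2, eta=0.1)

print("explicit GDA distance from equilibrium:", np.hypot(g1, d1))       # grows
print("proximal-point distance from equilibrium:", np.hypot(g2, d2))    # shrinks
```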