In empirical game-theoretic analysis (EGTA), game models are extended iteratively by generating new strategies, learned from experience with prior strategies. The strategy exploration problem in EGTA is how to direct this process so as to construct effective models with minimal iteration. A variety of approaches have been proposed in the literature, based on both classic techniques and novel concepts. Comparing the performance of these alternatives can be surprisingly subtle, depending sensitively on the criteria adopted and the measures employed. We investigate methodological considerations in evaluating strategy exploration, defining key distinctions and identifying a few general principles based on examples and experimental observations. In particular, we emphasize that an empirical game defines a space of strategies that should be evaluated as a whole. On this basis, we suggest that the minimum regret constrained profile (MRCP) provides a particularly robust criterion for evaluating a strategy space, and we propose a local search method for computing MRCP that outperforms previous approaches. Computing MRCP is not always feasible, however, especially in large games. For such cases, we highlight consistency considerations for comparing across different approaches. Surprisingly, we find that recent works violate consistency considerations necessary for sound evaluation, which may lead to misleading conclusions about the performance of different approaches. We therefore propose a new evaluation scheme and demonstrate that it reveals the true learning performance of different approaches more faithfully than previous evaluation methods.
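To make the MRCP notion concrete, the following is a minimal toy sketch (not the paper's algorithm): in a small symmetric matrix game, a candidate profile mixes only over the explored (restricted) strategies, but its regret is measured against deviations to all strategies of the full game. The game, the two-strategy restriction, and the brute-force simplex grid are all illustrative assumptions.

```python
import numpy as np

def regret(A, sigma):
    """Regret of a symmetric mixed profile sigma in the full game:
    best pure-deviation payoff minus the profile's expected payoff.
    A[i, j] is the row player's payoff for pure strategy i vs j."""
    dev_payoffs = A @ sigma      # payoff of each pure strategy against sigma
    value = sigma @ dev_payoffs  # expected payoff of sigma against itself
    return float(dev_payoffs.max() - value)

def min_regret_profile(A, restricted, grid=101):
    """Grid search over the simplex for the profile supported on the
    `restricted` strategies with minimum full-game regret. Only practical
    for toy games with a two-strategy restriction."""
    i, j = restricted
    best_r, best_sigma = np.inf, None
    for p in np.linspace(0.0, 1.0, grid):
        sigma = np.zeros(A.shape[0])
        sigma[i], sigma[j] = p, 1.0 - p
        r = regret(A, sigma)
        if r < best_r:
            best_r, best_sigma = r, sigma
    return best_r, best_sigma

# Rock-paper-scissors: restricting exploration to {rock, paper} cannot
# reach the full-game equilibrium, so the minimum-regret constrained
# profile has strictly positive regret -- exactly the quantity one would
# use to score how good the explored strategy space is as a whole.
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)
r, sigma = min_regret_profile(A, restricted=(0, 1))
print(round(r, 3), np.round(sigma, 2))
```

The point of the sketch is that regret is a property of the whole restricted strategy space, not of any single added strategy: adding scissors to the restriction would drop the achievable regret to zero.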
Trust region methods are widely applied in single-agent reinforcement learning problems due to their monotonic performance-improvement guarantee at every iteration. Nonetheless, when applied in multi-agent settings, the guarantee of trust region meth
Policy-Space Response Oracles (PSRO) is a general algorithmic framework for learning policies in multiagent systems by interleaving empirical game analysis with deep reinforcement learning (Deep RL). At each iteration, Deep RL is invoked to train a b
While many games were designed for steganography and robust watermarking, few focused on reversible watermarking. We present a two-encoder game related to the rate-distortion optimization of content-adaptive reversible watermarking. In the game, Alic
An interaction system has a finite set of agents that interact pairwise, depending on the current state of the system. Symmetric decomposition of the matrix of interaction coefficients yields the representation of states by self-adjoint matrices and
We consider how an agent should update her uncertainty when it is represented by a set P of probability distributions and the agent observes that a random variable X takes on value x, given that the agent makes decisions using the minimax criterion,