ترغب بنشر مسار تعليمي؟ اضغط هنا

Eliciting Information with Partial Signals in Repeated Games

152   0   0.0 ( 0 )
 نشر من قبل Yutong Wu
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

We consider an information elicitation game where the center needs the agent to self-report her actual usage of a service and charges her a payment accordingly. The center can only observe a partial signal, representing part of the agents true consumption, that is generated randomly from a publicly known distribution. The agent can report any information, as long as it does not contradict the signal, and the center issues a payment based on the reported information. Such problems find application in prosumer pricing, tax filing, etc., when the agents actual consumption of a service is masked from the center and verification of the submitted reports is impractical. The key difference between the current problem and classic information elicitation problems is that the agent gets to observe the full signal and act strategically, but the center can only see the partial signal. For this seemingly impossible problem, we propose a penalty mechanism that elicits truthful self-reports in a repeated game. In particular, besides charging the agent the reported value, the mechanism charges a penalty proportional to her inconsistent reports. We show how a combination of the penalty rate and the length of the game incentivizes the agent to be truthful for the entire game, a phenomenon we call fear of tomorrow verification. We show how approximate results for arbitrary distributions can be obtained by analyzing Bernoulli distributions. We extend our mechanism to a multi-agent cost sharing setting and give equilibrium results.



قيم البحث

اقرأ أيضاً

We consider an example of stochastic games with partial, asymmetric and non-classical information. We obtain relevant equilibrium policies using a new approach which allows managing the belief updates in a structured manner. Agents have access only t o partial information updates, and our approach is to consider optimal open loop control until the information update. The agents continuously control the rates of their Poisson search clocks to acquire the locks, the agent to get all the locks before others would get reward one. However, the agents have no information about the acquisition status of others and will incur a cost proportional to their rate process. We solved the problem for the case with two agents and two locks and conjectured the results for $N$-agents. We showed that a pair of (partial) state-dependent time-threshold policies form a Nash equilibrium.
This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. S uppose that while repeatedly playing this game, the row player chooses her strategy at each round by using a no-regret algorithm to minimize her (pseudo) regret. We develop a no-instant-regret algorithm for the column player to exhibit last round convergence to a minimax equilibrium. We show that our algorithm is efficient against a large set of popular no-regret algorithms of the row player, including the multiplicative weight update algorithm, the online mirror descent method/follow-the-regularized-leader, the linear multiplicative weight update algorithm, and the optimistic multiplicative weight update.
We define the notion of Bayes correlated Wardrop equilibrium for general nonatomic games with anonymous players and incomplete information. Bayes correlated Wardrop equilibria describe the set of equilibrium outcomes when a mediator, such as a traffi c information system, provides information to the players. We relate this notion to Bayes Wardrop equilibrium. Then, we provide conditions -- existence of a convex potential and complete information -- under which mediation does not improve equilibrium outcomes. We then study full implementation and, finally, information design in anonymous games with a finite set of players, when the number of players tends to infinity.
We consider extensive games with perfect information with well-founded game trees and study the problems of existence and of characterization of the sets of subgame perfect equilibria in these games. We also provide such characterizations for two cla sses of these games in which subgame perfect equilibria exist: two-player zero-sum games with, respectively, two and three outcomes.
The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best exemplified by AlphaZero. However, prio r algorithms of this form cannot cope with imperfect-information games. This paper presents ReBeL, a general framework for self-play reinforcement learning and search that provably converges to a Nash equilibrium in any two-player zero-sum game. In the simpler setting of perfect-information games, ReBeL reduces to an algorithm similar to AlphaZero. Results in two different imperfect-information games show ReBeL converges to an approximate Nash equilibrium. We also show ReBeL achieves superhuman performance in heads-up no-limit Texas holdem poker, while using far less domain knowledge than any prior poker AI.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا