No Arabic abstract
We consider a dynamic game with asymmetric information where each player observes privately a noisy version of a (hidden) state of the world V, resulting in dependent private observations. We study structured perfect Bayesian equilibria that use private beliefs in their strategies as sufficient statistics for summarizing their observation history. The main difficulty in finding the appropriate sufficient statistic (state) for the structured strategies arises from the fact that players need to construct (private) beliefs on other players private beliefs on V, which in turn would imply that an infinite hierarchy of beliefs on beliefs needs to be constructed, rendering the problem unsolvable. We show that this is not the case: each players belief on other players beliefs on V can be characterized by her own belief on V and some appropriately defined public belief. We then specialize this setting to the case of a Linear Quadratic Gaussian (LQG) non-zero-sum game and we characterize linear structured PBE that can be found through a backward/forward algorithm akin to dynamic programming for the standard LQG control problem. Unlike the standard LQG problem, however, some of the required quantities for the Kalman filter are observation-dependent and thus cannot be evaluated off-line through a forward recursion.
We consider a non-zero-sum linear quadratic Gaussian (LQG) dynamic game with asymmetric information. Each player observes privately a noisy version of a (hidden) state of the world $V$, resulting in dependent private observations. We study perfect Bayesian equilibria (PBE) for this game with equilibrium strategies that are linear in players private estimates of $V$. The main difficulty arises from the fact that players need to construct estimates on other players estimate on $V$, which in turn would imply that an infinite hierarchy of estimates on estimates needs to be constructed, rendering the problem unsolvable. We show that this is not the case: each players estimate on other players estimates on $V$ can be summarized into her own estimate on $V$ and some appropriately defined public information. Based on this finding we characterize the PBE through a backward/forward algorithm akin to dynamic programming for the standard LQG control problem. Unlike the standard LQG problem, however, Kalman filter covariance matrices, as well as some other required quantities, are observation-dependent and thus cannot be evaluated off-line through a forward recursion.
Extensive games are tools largely used in economics to describe decision processes ofa community of agents. In this paper we propose a formal presentation based on theproof assistant COQ which focuses mostly on infinite extensive games and theircharacteristics. COQ proposes a feature called dependent types, which meansthat the type of an object may depend on the type of its components. For instance,the set of choices or the set of utilities of an agent may depend on the agentherself. Using dependent types, we describe formally a very general class of gamesand strategy profiles, which corresponds somewhat to what game theorists are used to.We also discuss the notions of infiniteness in game theory and how this can beprecisely described.
Today many mobile users in various zones are invited to sense and send back real-time useful information (e.g., traffic observation and sensor data) to keep the freshness of the content updates in such zones. However, due to the sampling cost in sensing and transmission, a user may not have the incentive to contribute the real-time information to help reduce the age of information (AoI). We propose dynamic pricing for each zone to offer age-dependent monetary returns and encourage users to sample information at different rates over time. This dynamic pricing design problem needs to well balance the monetary payments as rewards to users and the AoI evolution over time, and is challenging to solve especially under the incomplete information about users arrivals and their private sampling costs. After formulating the problem as a nonlinear constrained dynamic program, to avoid the curse of dimensionality, we first propose to approximate the dynamic AoI reduction as a time-average term and successfully solve the approximate dynamic pricing in closed-form. Further, by providing the steady-state analysis for an infinite time horizon, we show that the pricing scheme (though in closed-form) can be further simplified to an $varepsilon$-optimal version without recursive computing over time. Finally, we extend the AoI control from a single zone to many zones with heterogeneous user arrival rates and initial ages, where each zone cares not only its own AoI dynamics but also the average AoI of all the zones in a mean field game system to provide a holistic service. Accordingly, we propose decentralized mean field pricing for each zone to self-operate by using a mean field term to estimate the average age dynamics of all the zones, which does not even require many zones to exchange their local data with each other.
We consider an example of stochastic games with partial, asymmetric and non-classical information. We obtain relevant equilibrium policies using a new approach which allows managing the belief updates in a structured manner. Agents have access only to partial information updates, and our approach is to consider optimal open loop control until the information update. The agents continuously control the rates of their Poisson search clocks to acquire the locks, the agent to get all the locks before others would get reward one. However, the agents have no information about the acquisition status of others and will incur a cost proportional to their rate process. We solved the problem for the case with two agents and two locks and conjectured the results for $N$-agents. We showed that a pair of (partial) state-dependent time-threshold policies form a Nash equilibrium.
Decentralized team problems where players have asymmetric information about the state of the underlying stochastic system have been actively studied, but games between such teams are less understood. We consider a general model of zero-sum stochastic games between two competing teams. This model subsumes many previously considered team and zero-sum game models. For this general model, we provide bounds on the upper (min-max) and lower (max-min) values of the game. Furthermore, if the upper and lower values of the game are identical (i.e., if the game has a value), our bounds coincide with the value of the game. Our bounds are obtained using two dynamic programs based on a sufficient statistic known as the common information belief (CIB). We also identify certain information structures in which only the minimizing team controls the evolution of the CIB. In these cases, we show that one of our CIB based dynamic programs can be used to find the min-max strategy (in addition to the min-max value). We propose an approximate dynamic programming approach for computing the values (and the strategy when applicable) and illustrate our results with the help of an example.