
Bayesian Learning in Dynamic Non-atomic Routing Games

Published by Emilien Macault
Publication date: 2020
Language: English





We consider a discrete-time nonatomic routing game with variable demand and uncertain costs. Given a routing network with a single origin and a single destination, the cost function of each edge depends on some uncertain persistent state parameter. At every period, a random traffic demand is routed through the network according to a Bayes-Wardrop equilibrium. The realized costs are publicly observed and the Bayesian belief about the state parameter is updated. We say that there is strong learning when beliefs converge to the truth and weak learning when the equilibrium flow converges to the complete-information flow. We characterize the networks for which learning occurs. We prove that these networks have a series-parallel structure and provide a counterexample to show that the condition is necessary.
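The loop described in the abstract (draw a demand, route it at equilibrium under the current belief, observe costs, update the belief) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a network of two parallel edges, a binary state, linear costs, the values in `SLOPES`, and Gaussian noise on the observed cost. The function names are hypothetical as well.

```python
import math
import random

# Illustrative assumptions only; none of these values come from the paper.
SLOPES = {0: 1.0, 1: 3.0}  # slope of the uncertain edge under each state
NOISE_SD = 0.5             # assumed std. dev. of noise on the observed cost


def equilibrium_flow(demand, belief):
    """Split demand over two parallel edges so that expected costs are equal.

    Edge A has known cost x_a; edge B has cost slope * x_b, where the slope
    depends on the unknown state. `belief` is the probability of state 1.
    """
    expected_slope = belief * SLOPES[1] + (1.0 - belief) * SLOPES[0]
    # Solve x_a = expected_slope * x_b together with x_a + x_b = demand.
    x_b = demand / (1.0 + expected_slope)
    return demand - x_b, x_b


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and std. dev."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def update_belief(belief, x_b, observed_cost):
    """Bayes' rule for P(state = 1) after observing the cost on edge B."""
    like1 = gaussian_pdf(observed_cost, SLOPES[1] * x_b, NOISE_SD)
    like0 = gaussian_pdf(observed_cost, SLOPES[0] * x_b, NOISE_SD)
    numerator = belief * like1
    denominator = numerator + (1.0 - belief) * like0
    return numerator / denominator if denominator > 0.0 else belief


def simulate(true_state, periods=200, seed=0):
    """Run the route-observe-update loop and return the final belief."""
    rng = random.Random(seed)
    belief = 0.5  # uniform prior over the two states
    for _ in range(periods):
        demand = rng.uniform(0.5, 2.0)  # random traffic demand
        _, x_b = equilibrium_flow(demand, belief)
        observed = SLOPES[true_state] * x_b + rng.gauss(0.0, NOISE_SD)
        belief = update_belief(belief, x_b, observed)
    return belief
```

In this toy setting both edges always carry flow, so every period is informative and the belief concentrates on the true state, which corresponds to what the abstract calls strong learning. Two parallel edges form a trivially series-parallel network; the sketch does not attempt to reproduce the paper's counterexample.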


Read also

Enxian Chen, Lei Qiao, Xiang Sun (2019)
This paper proposes a new equilibrium concept, robust perfect equilibrium, for non-cooperative games with a continuum of players, incorporating three types of perturbations. Such an equilibrium is shown to exist (in symmetric mixed strategies and in pure strategies) and to satisfy the important properties of admissibility, aggregate robustness, and ex post robust perfection. These properties strengthen relevant equilibrium results in an extensive literature on strategic interactions among a large number of agents. Illustrative applications to congestion games and potential games are presented. In the particular case of a congestion game with strictly increasing cost functions, we show that there is a unique symmetric robust perfect equilibrium.
We add here another layer to the literature on nonatomic anonymous games started with the 1973 paper by Schmeidler. More specifically, we define a new notion of equilibrium, which we call $\varepsilon$-estimated equilibrium, and prove its existence for any positive $\varepsilon$. This notion encompasses and brings to nonatomic games recent concepts of equilibrium such as self-confirming, peer-confirming, and Berk-Nash. This augmented scope is our main motivation. At the same time, our approach also resolves some conceptual problems present in Schmeidler (1973), pointed out by Shapley. In that paper, the existence of pure-strategy Nash equilibria was proved for any nonatomic game with a continuum of players endowed with an atomless countably additive probability. But requiring Borel measurability of strategy profiles may impose some limitation on players' choices and introduce an exogenous dependence among players' actions, which clashes with the nature of noncooperative game theory. Our suggested solution is to consider every subset of players as measurable. This leads to a nontrivial purely finitely additive component, which might prevent the existence of equilibria and requires a novel mathematical approach to prove the existence of $\varepsilon$-equilibria.
In this contribution, the performance of a multi-user system is analyzed in the context of frequency selective fading channels. Using game theoretic tools, a useful framework is provided in order to determine the optimal power allocation when users know only their own channel (while perfect channel state information is assumed at the base station). We consider the realistic case of frequency selective channels for uplink CDMA. This scenario illustrates the case of decentralized schemes, where limited information on the network is available at the terminal. Various receivers are considered, namely the matched filter, the MMSE filter and the optimum filter. The goal of this paper is to derive simple expressions for the non-cooperative Nash equilibrium as the number of mobiles becomes large and the spreading length increases. To that end, two asymptotic methodologies are combined. The first is asymptotic random matrix theory, which allows us to obtain explicit expressions of the impact of all other mobiles on any given tagged mobile. The second is the theory of non-atomic games, which computes good approximations of the Nash equilibrium as the number of mobiles grows.
We study an information-structure design problem (a.k.a. persuasion) with a single sender and multiple receivers with actions of a priori unknown types, independently drawn from action-specific marginal distributions. As in the standard Bayesian persuasion model, the sender has access to additional information regarding the action types, which she can exploit when committing to a (noisy) signaling scheme through which she sends a private signal to each receiver. The novelty of our model is in considering the case where the receivers interact in a sequential game with imperfect information, with utilities depending on the game outcome and the realized action types. After formalizing the notions of ex ante and ex interim persuasiveness (which differ in the time at which the receivers commit to following the sender's signaling scheme), we investigate the continuous optimization problem of computing a signaling scheme which maximizes the sender's expected revenue. We show that computing an optimal ex ante persuasive signaling scheme is NP-hard when there are three or more receivers. In contrast with previous hardness results for ex interim persuasion, we show that, for games with two receivers, an optimal ex ante persuasive signaling scheme can be computed in polynomial time thanks to a novel algorithm based on the ellipsoid method which we propose.
In this paper, we study large population multi-agent reinforcement learning (RL) in the context of discrete-time linear-quadratic mean-field games (LQ-MFGs). Our setting differs from most existing work on RL for MFGs in that we consider a non-stationary MFG over an infinite horizon. We propose an actor-critic algorithm to iteratively compute the mean-field equilibrium (MFE) of the LQ-MFG. There are two primary challenges: i) the non-stationarity of the MFG induces a linear-quadratic tracking problem, which requires solving a backwards-in-time (non-causal) equation that cannot be solved by standard (causal) RL algorithms; ii) many RL algorithms assume that the states are sampled from the stationary distribution of a Markov chain (MC), that is, the chain is already mixed, an assumption that is not satisfied for real data sources. We first identify that the mean-field trajectory follows linear dynamics, allowing the problem to be reformulated as a linear quadratic Gaussian problem. Under this reformulation, we propose an actor-critic algorithm that allows samples to be drawn from an unmixed MC. Finite-sample convergence guarantees for the algorithm are then provided. To characterize the performance of our algorithm in multi-agent RL, we have developed an error bound with respect to the Nash equilibrium of the finite-population game.