
Risk-Averse Equilibrium for Games

Published by Ali Yekkehkhany
Publication date: 2020
Research field: Informatics Engineering
Language: English





The term rational has become synonymous with maximizing expected payoff in the definition of the best response in the Nash setting. In this work, we consider stochastic games in which players engage only once, or at most a limited number of times. In such games, it may not be rational for players to maximize their expected payoff, as they cannot wait for the Law of Large Numbers to take effect. We instead define a new notion of a risk-averse best response that results in a risk-averse equilibrium (RAE), in which players choose the strategy that maximizes the probability of being rewarded the most in a single round of the game, rather than the strategy that maximizes the expected reward, subject to the actions of other players. We prove that the risk-averse equilibrium exists in all finite games and numerically compare its performance to that of the Nash equilibrium in finite-time stochastic games.
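To make the contrast between the two notions of best response concrete, here is a minimal Python sketch for a single player whose opponents' actions are held fixed. The payoff distributions, the action names `safe` and `risky`, the independence of payoffs across actions, and the tie-handling rule are all illustrative assumptions, not taken from the paper.

```python
from itertools import product

def expected_payoffs(dists):
    """Mean payoff of each action; dists maps action -> list of (payoff, prob)."""
    return {a: sum(p * x for x, p in d) for a, d in dists.items()}

def win_probabilities(dists):
    """Probability that each action yields the largest payoff in one round.

    Assumes payoffs are independent across actions (an assumption of this
    sketch) and credits every tied action when the maximum is shared.
    """
    actions = list(dists)
    wins = {a: 0.0 for a in actions}
    # Enumerate every joint outcome of the per-action payoff draws.
    for outcome in product(*(dists[a] for a in actions)):
        prob = 1.0
        for _, p in outcome:
            prob *= p
        best = max(x for x, _ in outcome)
        for a, (x, _) in zip(actions, outcome):
            if x == best:
                wins[a] += prob
    return wins

# Hypothetical payoffs, given fixed actions of the other players.
payoffs = {
    "safe": [(1.0, 1.0)],                # always pays 1
    "risky": [(20.0, 0.1), (0.0, 0.9)],  # pays 20 with probability 0.1
}

means = expected_payoffs(payoffs)
wins = win_probabilities(payoffs)

expected_choice = max(means, key=means.get)   # expected-payoff best response
risk_averse_choice = max(wins, key=wins.get)  # risk-averse best response

print("expected-payoff best response:", expected_choice)
print("risk-averse best response:", risk_averse_choice)
```

In this toy case the risky action has the higher mean, yet the safe action pays more in nine rounds out of ten, so the two criteria pick different actions.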


Read also

The fast-growing market of autonomous vehicles, unmanned aerial vehicles, and fleets in general necessitates the design of smart and automatic navigation systems that consider the stochastic latency along different paths in the traffic network. The longstanding shortest path problem in a deterministic network, whose counterpart in a congestion game setting is the Wardrop equilibrium, has been studied extensively, but it is well known that defining an optimal path is challenging in a traffic network with stochastic arc delays. In this work, we propose three classes of risk-averse equilibria for an atomic stochastic congestion game in its general form, where the arc delay distributions are load dependent and not necessarily independent of each other. The three classes are risk-averse equilibrium (RAE), mean-variance equilibrium (MVE), and conditional value at risk level $\alpha$ equilibrium (CVaR$_\alpha$E), whose notions of risk-averse best responses are based on maximizing the probability of taking the shortest path, minimizing a linear combination of the mean and variance of path delay, and minimizing the expected delay at a specified risky quantile of the delay distributions, respectively. We prove that for any finite stochastic atomic congestion game, the risk-averse, mean-variance, and CVaR$_\alpha$ equilibria exist. We show that for risk-averse travelers, the Braess paradox may not occur to the extent presented originally, since players do not necessarily travel along the shortest path in expectation but also take the uncertainty of travel time into consideration. We show through some examples that the price of anarchy can be improved when players are risk-averse and travel according to one of the three classes of risk-averse equilibria rather than the Wardrop equilibrium.
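The three best-response criteria named in this abstract can be sketched for one traveler as follows. This is only an illustration under stated assumptions: the delay scenarios, the path names `A`, `B`, `C`, the weight `rho`, and the level `alpha` are hypothetical, the other travelers' choices are held fixed (so load dependence is not modeled), and CVaR is approximated by a simple average over the worst equally likely scenarios.

```python
import math
from statistics import mean, pvariance

# Hypothetical, equally likely joint delay scenarios (minutes) for three
# paths. Listing joint scenarios lets the path delays be dependent.
scenarios = [
    {"A": 8, "B": 15, "C": 12},
    {"A": 8, "B": 15, "C": 14},
    {"A": 8, "B": 15, "C": 12},
    {"A": 8, "B": 15, "C": 14},
    {"A": 48, "B": 15, "C": 12},
]
paths = ["A", "B", "C"]

def prob_shortest(path):
    """RAE criterion: probability that this path is the shortest one."""
    hits = sum(1 for s in scenarios if s[path] == min(s.values()))
    return hits / len(scenarios)

def mean_variance(path, rho=1.0):
    """MVE criterion: mean delay plus rho times the delay variance."""
    delays = [s[path] for s in scenarios]
    return mean(delays) + rho * pvariance(delays)

def cvar(path, alpha=0.8):
    """CVaR criterion: average delay over the worst (1 - alpha) share."""
    delays = sorted((s[path] for s in scenarios), reverse=True)
    # Round before ceil to guard against floating-point noise in (1 - alpha).
    k = max(1, math.ceil(round((1 - alpha) * len(delays), 9)))
    return mean(delays[:k])

rae_choice = max(paths, key=prob_shortest)  # maximize P(shortest)
mve_choice = min(paths, key=mean_variance)  # minimize mean + rho * variance
cvar_choice = min(paths, key=cvar)          # minimize tail-average delay

print("RAE:", rae_choice, "MVE:", mve_choice, "CVaR:", cvar_choice)
```

With these made-up numbers, path A is most often the shortest, but its rare long delay makes it the worst choice under the mean-variance and CVaR criteria, so the criteria need not agree.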
Utsav Sadana, Erick Delage (2020)
Conditional Value at Risk (CVaR) is widely used to account for the preferences of a risk-averse agent in extreme loss scenarios. To study the effectiveness of randomization in interdiction games with an interdictor that is both risk and ambiguity averse, we introduce a distributionally robust network interdiction game where the interdictor randomizes over the feasible interdiction plans in order to minimize the worst-case CVaR of the flow with respect to both the unknown distribution of the capacity of the arcs and his mixed strategy over interdicted arcs. The flow player, on the contrary, maximizes the total flow in the network. By using the budgeted uncertainty set, we control the degree of conservatism in the model and reformulate the interdictor's non-linear problem as a bi-convex optimization problem. To solve this problem to any given optimality level, we devise a spatial branch and bound algorithm that uses the McCormick inequalities and the reduced reformulation linearization technique (RRLT) to obtain a convex relaxation of the problem. We also develop a column generation algorithm to identify the optimal support of the convex relaxation, which is then used in the coordinate descent algorithm to determine the upper bounds. The efficiency and convergence of the spatial branch and bound algorithm are established in the numerical experiments. Further, our numerical experiments show that randomized strategies can have significantly better in-sample and out-of-sample performance than optimal deterministic ones.
The large majority of risk-sharing transactions involve few agents, each of whom can heavily influence the structure and the prices of securities. This paper proposes a game where agents' strategic sets consist of all possible sharing securities and pricing kernels that are consistent with Arrow-Debreu sharing rules. First, it is shown that agents' best response problems have unique solutions. The risk-sharing Nash equilibrium admits a finite-dimensional characterisation, and it is proved to exist for an arbitrary number of agents and to be unique in the two-agent game. In equilibrium, agents declare beliefs on future random outcomes that differ from their actual probability assessments, and the risk-sharing securities are endogenously bounded, implying (among other things) loss of efficiency. In addition, an analysis regarding extremely risk-tolerant agents indicates that they profit more from the Nash risk-sharing equilibrium than from the Arrow-Debreu one.
We motivate and propose a new model for non-cooperative Markov games which considers the interactions of risk-aware players. This model characterizes the time-consistent dynamic risk from both stochastic state transitions (inherent to the game) and randomized mixed strategies (due to all other players). An appropriate risk-aware equilibrium concept is proposed, and the existence of such equilibria is demonstrated in stationary strategies by an application of Kakutani's fixed point theorem. We further propose a simulation-based Q-learning type algorithm for risk-aware equilibrium computation. This algorithm works with a special form of minimax risk measures which can naturally be written as saddle-point stochastic optimization problems, and covers many widely investigated risk measures. Finally, the almost sure convergence of this simulation-based algorithm to an equilibrium is demonstrated under some mild conditions. Our numerical experiments on a two-player queuing game validate the properties of our model and algorithm, and demonstrate their worth and applicability in real-life competitive decision-making.
This paper considers a non-cooperative game in which competing users sharing a frequency-selective interference channel selfishly optimize their power allocation in order to improve their achievable rates. Previously, it was shown that a user having knowledge of its opponents' channel state information can make foresighted decisions and substantially improve its performance compared with the case in which it deploys the conventional iterative water-filling algorithm, which does not exploit such knowledge. This paper discusses how a foresighted user can acquire this knowledge by modeling its experienced interference as a function of its own power allocation. To characterize the outcome of the multi-user interaction, the conjectural equilibrium is introduced, and the existence of this equilibrium for the investigated water-filling game is proved. Interestingly, both the Nash equilibrium and the Stackelberg equilibrium are shown to be special cases of the generalization of conjectural equilibrium. We develop practical algorithms to form accurate beliefs and search for desirable power allocation strategies. Numerical simulations indicate that a foresighted user without any a priori knowledge of its competitors' private information can effectively learn the required information and induce the entire system to an operating point that improves both its own achievable rate and the rates of the other participants in the water-filling game.