ترغب بنشر مسار تعليمي؟ اضغط هنا

Models of Level-0 Behavior for Predicting Human Behavior in Games

111   0   0.0 ( 0 )
 نشر من قبل James Wright
 تاريخ النشر 2016
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Behavioral game theory seeks to describe the way actual people (as compared to idealized, rational agents) act in strategic situations. Our own recent work has identified iterative models (such as quantal cognitive hierarchy) as the state of the art for predicting human play in unrepeated, simultaneous-move games (Wright & Leyton-Brown 2012, 2016). Iterative models predict that agents reason iteratively about their opponents, building up from a specification of nonstrategic behavior called level-0. The modeler is in principle free to choose any description of level-0 behavior that makes sense for the setting. However, almost all existing work specifies this behavior as a uniform distribution over actions. In most games it is not plausible that even nonstrategic agents would choose an action uniformly at random, nor that other agents would expect them to do so. A more accurate model for level-0 behavior has the potential to dramatically improve predictions of human behavior, since a substantial fraction of agents may play level-0 strategies directly, and furthermore since iterative models ground all higher-level strategies in responses to the level-0 strategy. Our work considers models of the way in which level-0 agents construct a probability distribution over actions, given an arbitrary game. Using a Bayesian optimization package called SMAC (Hutter, Hoos, & Leyton-Brown, 2010, 2011, 2012), we systematically evaluated a large space of such models, each of which makes its prediction based only on general features that can be computed from any normal form game. In the end, we recommend a model that achieved excellent performance across the board: a linear weighting of features that requires the estimation of four weights. We evaluated the effects of combining this new level-0 model with several iterative models, and observed large improvements in the models predictive accuracies.



قيم البحث

اقرأ أيضاً

It is common to assume that agents will adopt Nash equilibrium strategies; however, experimental studies have demonstrated that Nash equilibrium is often a poor description of human players behavior in unrepeated normal-form games. In this paper, we analyze five widely studied models (Quantal Response Equilibrium, Level-k, Cognitive Hierarchy, QLk, and Noisy Introspection) that aim to describe actual, rather than idealized, human behavior in such games. We performed what we believe is the most comprehensive meta-analysis of these models, leveraging ten different data sets from the literature recording human play of two-player games. We began by evaluating the models generalization or predictive performance, asking how well a model fits unseen test data after having had its parameters calibrated based on separate training data. Surprisingly, we found that what we dub the QLk model of Stahl & Wilson (1994) consistently achieved the best performance. Motivated by this finding, we describe methods for analyzing the posterior distributions over a models parameters. We found that QLks parameters were being set to values that were not consistent with their intended economic interpretations. We thus explored variations of QLk, ultimately identifying a new model family that has fewer parameters, gives rise to more parsimonious parameter values, and achieves better predictive performance.
The question of how people vote strategically under uncertainty has attracted much attention in several disciplines. Theoretical decision models have been proposed which vary in their assumptions on the sophistication of the voters and on the informa tion made available to them about others preferences and their voting behavior. This work focuses on modeling strategic voting behavior under poll information. It proposes a new heuristic for voting behavior that weighs the success of each candidate according to the poll score with the utility of the candidate given the voters preferences. The model weights can be tuned individually for each voter. We compared this model with other relevant voting models from the literature on data obtained from a recently released large scale study. We show that the new heuristic outperforms all other tested models. The prediction errors of the model can be partly explained due to inconsistent voters that vote for (weakly) dominated candidates.
Stackelberg security games are a critical tool for maximizing the utility of limited defense resources to protect important targets from an intelligent adversary. Motivated by green security, where the defender may only observe an adversarys response to defense on a limited set of targets, we study the problem of learning a defense that generalizes well to a new set of targets with novel feature values and combinations. Traditionally, this problem has been addressed via a two-stage approach where an adversary model is trained to maximize predictive accuracy without considering the defenders optimization problem. We develop an end-to-end game-focused approach, where the adversary model is trained to maximize a surrogate for the defenders expected utility. We show both in theory and experimental results that our game-focused approach achieves higher defender expected utility than the two-stage alternative when there is limited data.
Large-scale collection of human behavioral data by companies raises serious privacy concerns. We show that behavior captured in the form of application usage data collected from smartphones is highly unique even in very large datasets encompassing mi llions of individuals. This makes behavior-based re-identification of users across datasets possible. We study 12 months of data from 3.5 million users and show that four apps are enough to uniquely re-identify 91.2% of users using a simple strategy based on public information. Furthermore, we show that there is seasonal variability in uniqueness and that application usage fingerprints drift over time at an average constant rate.
423 - Valerio Capraro 2016
In the last few decades, numerous experiments have shown that humans do not always behave so as to maximize their material payoff. Cooperative behavior when non-cooperation is a dominant strategy (with respect to the material payoffs) is particularly puzzling. Here we propose a novel approach to explain cooperation, assuming what Halpern and Pass call translucent players. Typically, players are assumed to be opaque, in the sense that a deviation by one player in a normal-form game does not affect the strategies used by other players. But a player may believe that if he switches from one strategy to another, the fact that he chooses to switch may be visible to the other players. For example, if he chooses to defect in Prisoners Dilemma, the other player may sense his guilt. We show that by assuming translucent players, we can recover many of the regularities observed in human behavior in well-studied games such as Prisoners Dilemma, Travelers Dilemma, Bertrand Competition, and the Public Goods game.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا