ترغب بنشر مسار تعليمي؟ اضغط هنا

Planning with Multiple Biases

52   0   0.0 ( 0 )
 نشر من قبل Sigal Oren
 تاريخ النشر 2017
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Recent work has considered theoretical models for the behavior of agents with specific behavioral biases: rather than making decisions that optimize a given payoff function, the agent behaves inefficiently because its decisions suffer from an underlying bias. These approaches have generally considered an agent who experiences a single behavioral bias, studying the effect of this bias on the outcome. In general, however, decision-making can and will be affected by multiple biases operating at the same time. How do multiple biases interact to produce the overall outcome? Here we consider decisions in the presence of a pair of biases exhibiting an intuitively natural interaction: present bias -- the tendency to value costs incurred in the present too highly -- and sunk-cost bias -- the tendency to incorporate costs experienced in the past into ones plans for the future. We propose a theoretical model for planning with this pair of biases, and we show how certain natural behavioral phenomena can arise in our model only when agents exhibit both biases. As part of our model we differentiate between agents that are aware of their biases (sophisticated) and agents that are unaware of them (naive). Interestingly, we show that the interaction between the two biases is quite complex: in some cases, they mitigate each others effects while in other cases they might amplify each other. We obtain a number of further results as well, including the fact that the planning problem in our model for an agent experiencing and aware of both biases is computationally hard in general, though tractable under more relaxed assumptions.



قيم البحث

اقرأ أيضاً

Present bias, the tendency to weigh costs and benefits incurred in the present too heavily, is one of the most widespread human behavioral biases. It has also been the subject of extensive study in the behavioral economics literature. While the simpl est models assume that the agents are naive, reasoning about the future without taking their bias into account, there is considerable evidence that people often behave in ways that are sophisticated with respect to present bias, making plans based on the belief that they will be present-biased in the future. For example, committing to a course of action to reduce future opportunities for procrastination or overconsumption are instances of sophisticated behavior in everyday life. Models of sophisticated behavior have lacked an underlying formalism that allows one to reason over the full space of multi-step tasks that a sophisticated agent might face. This has made it correspondingly difficult to make comparative or worst-case statements about the performance of sophisticated agents in arbitrary scenarios. In this paper, we incorporate the notion of sophistication into a graph-theoretic model that we used in recent work for modeling naive agents. This new synthesis of two formalisms - sophistication and graph-theoretic planning - uncovers a rich structure that wasnt apparent in the earlier behavioral economics work on this problem. In particular, our graph-theoretic model makes two kinds of new results possible. First, we give tight worst-case bounds on the performance of sophisticated agents in arbitrary multi-step tasks relative to the optimal plan. Second, the flexibility of our formalism makes it possible to identify new phenomena that had not been seen in prior literature: these include a surprising non-monotonic property in the use of rewards to motivate sophisticated agents and a framework for reasoning about commitment devices.
Unmanned aerial vehicles (UAVs) are expected to be an integral part of wireless networks. In this paper, we aim to find collision-free paths for multiple cellular-connected UAVs, while satisfying requirements of connectivity with ground base stations (GBSs) in the presence of a dynamic jammer. We first formulate the problem as a sequential decision making problem in discrete domain, with connectivity, collision avoidance, and kinematic constraints. We, then, propose an offline temporal difference (TD) learning algorithm with online signal-to-interference-plus-noise ratio (SINR) mapping to solve the problem. More specifically, a value network is constructed and trained offline by TD method to encode the interactions among the UAVs and between the UAVs and the environment; and an online SINR mapping deep neural network (DNN) is designed and trained by supervised learning, to encode the influence and changes due to the jammer. Numerical results show that, without any information on the jammer, the proposed algorithm can achieve performance levels close to that of the ideal scenario with the perfect SINR-map. Real-time navigation for multi-UAVs can be efficiently performed with high success rates, and collisions are avoided.
In this paper, we design gross product maximization mechanisms which incentivize users to upload high-quality contents on user-generated-content (UGC) websites. We show that, the proportional division mechanism, which is widely used in practice, can perform arbitrarily bad in the worst case. The problem can be formulated using a linear program with bounded and increasing variables. We then present an $O(nlog n)$ algorithm to find the optimal mechanism, where n is the number of players.
We examine the problem of assigning plots of land to prospective buyers who prefer living next to their friends. They care not only about the plot they receive, but also about their neighbors. This externality results in a highly non-trivial problem structure, as both friendship and land value play a role in determining agent behavior. We examine mechanisms that guarantee truthful reporting of both land values and friendships. We propose variants of random serial dictatorship (RSD) that can offer both truthfulness and welfare guarantees. Interestingly, our social welfare guarantees are parameterized by the value of friendship: if these values are low, enforcing truthful behavior results in poor welfare guarantees and imposes significant constraints on agents choices; if they are high, we achieve good approximation to the optimal social welfare.
We study secretary problems in settings with multiple agents. In the standard secretary problem, a sequence of arbitrary awards arrive online, in a random order, and a single decision maker makes an immediate and irrevocable decision whether to accep t each award upon its arrival. The requirement to make immediate decisions arises in many cases due to an implicit assumption regarding competition. Namely, if the decision maker does not take the offered award immediately, it will be taken by someone else. The novelty in this paper is in introducing a multi-agent model in which the competition is endogenous. In our model, multiple agents compete over the arriving awards, but the decisions need not be immediate; instead, agents may select previous awards as long as they are available (i.e., not taken by another agent). If an award is selected by multiple agents, ties are broken either randomly or according to a global ranking. This induces a multi-agent game in which the time of selection is not enforced by the rules of the games, rather it is an important component of the agents strategy. We study the structure and performance of equilibria in this game. For random tie breaking, we characterize the equilibria of the game, and show that the expected social welfare in equilibrium is nearly optimal, despite competition among the agents. For ranked tie breaking, we give a full characterization of equilibria in the 3-agent game, and show that as the number of agents grows, the winning probability of every agent under non-immediate selections approaches her winning probability under immediate selections.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا