
Planning Problems for Sophisticated Agents with Present Bias

Published by: Sigal Oren
Publication date: 2016
Research field: Informatics Engineering
Language: English





Present bias, the tendency to weigh costs and benefits incurred in the present too heavily, is one of the most widespread human behavioral biases. It has also been the subject of extensive study in the behavioral economics literature. While the simplest models assume that the agents are naive, reasoning about the future without taking their bias into account, there is considerable evidence that people often behave in ways that are sophisticated with respect to present bias, making plans based on the belief that they will be present-biased in the future. For example, committing to a course of action in order to reduce future opportunities for procrastination or overconsumption is an instance of sophisticated behavior in everyday life. Models of sophisticated behavior have lacked an underlying formalism that allows one to reason over the full space of multi-step tasks that a sophisticated agent might face. This has made it correspondingly difficult to make comparative or worst-case statements about the performance of sophisticated agents in arbitrary scenarios. In this paper, we incorporate the notion of sophistication into a graph-theoretic model that we used in recent work for modeling naive agents. This new synthesis of two formalisms - sophistication and graph-theoretic planning - uncovers a rich structure that was not apparent in the earlier behavioral economics work on this problem. In particular, our graph-theoretic model makes two kinds of new results possible. First, we give tight worst-case bounds on the performance of sophisticated agents in arbitrary multi-step tasks relative to the optimal plan. Second, the flexibility of our formalism makes it possible to identify new phenomena that had not been seen in prior literature: these include a surprising non-monotonic property in the use of rewards to motivate sophisticated agents and a framework for reasoning about commitment devices.
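To make the contrast between naive and sophisticated agents concrete, here is a minimal sketch on a small task graph. It rests on assumptions that the abstract does not spell out: the graph is acyclic, every node can reach the goal, the agent multiplies the cost of the edge it is about to traverse by a bias factor `BIAS`, a naive agent assumes its future selves will follow the cheapest path, and a sophisticated agent correctly predicts what its biased future selves will actually do. The graph, its costs, and all function names are hypothetical and chosen only for illustration.

```python
from functools import lru_cache

# Hypothetical task graph: node -> {successor: edge cost}; "t" is the goal.
GRAPH = {
    "s": {"a": 1.0, "t": 3.5},
    "a": {"t": 4.0, "b": 1.0},
    "b": {"t": 5.0},
    "t": {},
}
GOAL = "t"
BIAS = 2.0  # assumed present-bias factor applied to the immediate edge


@lru_cache(maxsize=None)
def shortest(v):
    """True cheapest cost from v to the goal (no bias)."""
    if v == GOAL:
        return 0.0
    return min(c + shortest(u) for u, c in GRAPH[v].items())


def naive_choice(v):
    """Naive agent: inflates the next edge, assumes optimal play afterwards."""
    return min(GRAPH[v], key=lambda u: BIAS * GRAPH[v][u] + shortest(u))


@lru_cache(maxsize=None)
def sophisticated_cost(v):
    """Actual (unbiased) cost a sophisticated agent ends up paying from v."""
    if v == GOAL:
        return 0.0
    u = sophisticated_choice(v)
    return GRAPH[v][u] + sophisticated_cost(u)


def sophisticated_choice(v):
    """Sophisticated agent: inflates the next edge, but evaluates the rest
    of the path by what its biased future selves will really do."""
    return min(GRAPH[v], key=lambda u: BIAS * GRAPH[v][u] + sophisticated_cost(u))


def walk(start, choose):
    """Follow an agent's choices from start to the goal; return (path, cost)."""
    path, cost, v = [start], 0.0, start
    while v != GOAL:
        u = choose(v)
        cost += GRAPH[v][u]
        path.append(u)
        v = u
    return path, cost


if __name__ == "__main__":
    print("optimal cost:", shortest("s"))
    print("naive:", walk("s", naive_choice))
    print("sophisticated:", walk("s", sophisticated_choice))
```

In this particular graph the naive agent keeps postponing the expensive step and pays 7.0 along s, a, b, t, while the sophisticated agent foresees that drift and takes the direct edge for 3.5. This is a single hand-picked instance, not a statement about the worst-case bounds reported in the paper.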




Read also

Recent work has considered theoretical models for the behavior of agents with specific behavioral biases: rather than making decisions that optimize a given payoff function, the agent behaves inefficiently because its decisions suffer from an underlying bias. These approaches have generally considered an agent who experiences a single behavioral bias, studying the effect of this bias on the outcome. In general, however, decision-making can and will be affected by multiple biases operating at the same time. How do multiple biases interact to produce the overall outcome? Here we consider decisions in the presence of a pair of biases exhibiting an intuitively natural interaction: present bias -- the tendency to value costs incurred in the present too highly -- and sunk-cost bias -- the tendency to incorporate costs experienced in the past into one's plans for the future. We propose a theoretical model for planning with this pair of biases, and we show how certain natural behavioral phenomena can arise in our model only when agents exhibit both biases. As part of our model we differentiate between agents that are aware of their biases (sophisticated) and agents that are unaware of them (naive). Interestingly, we show that the interaction between the two biases is quite complex: in some cases, they mitigate each other's effects while in other cases they might amplify each other. We obtain a number of further results as well, including the fact that the planning problem in our model for an agent experiencing and aware of both biases is computationally hard in general, though tractable under more relaxed assumptions.
Individuals working towards a goal often exhibit time-inconsistent behavior, making plans and then failing to follow through. One well-known model of such behavioral anomalies is present-bias discounting: individuals over-weight present costs by a bias factor. This model explains many time-inconsistent behaviors, but can make stark predictions in many settings: individuals either follow the most efficient plan for reaching their goal or procrastinate indefinitely. We propose a modification in which the present-bias parameter can vary over time, drawn independently each step from a fixed distribution. Following Kleinberg and Oren (2014), we use a weighted task graph to model task planning, and measure the cost of procrastination as the relative expected cost of the chosen path versus the optimal path. We use a novel connection to optimal pricing theory to describe the structure of the worst-case task graph for any present-bias distribution. We then leverage this structure to derive conditions on the bias distribution under which the worst-case ratio is exponential (in time) or constant. We also examine conditions on the task graph that lead to improved procrastination ratios: graphs with a uniformly bounded distance to the goal, and graphs in which the distance to the goal monotonically decreases on any path.
Modeling social interactions based on individual behavior has always been an area of interest, but prior literature generally presumes rational behavior. Thus, such models may miss out on capturing the effects of biases humans are susceptible to. This work presents a method to model egocentric bias, the real-life tendency to emphasize one's own opinion heavily when presented with multiple opinions. We use a symmetric distribution centered at an agent's own opinion, as opposed to the Bounded Confidence (BC) model used in prior work. We consider a game of iterated interactions where an agent cooperates based on its opinion about an opponent. Our model also includes the concept of domain-based self-doubt, which varies as the interaction succeeds or not. An increase in doubt makes an agent reduce its egocentricity in subsequent interactions, thus enabling the agent to learn reactively. The agent system is modeled with factions not having a single leader, to overcome some of the issues associated with leader-follower factions. We find that agents belonging to factions perform better than individual agents. We observe that an intermediate level of egocentricity helps the agent perform at its best, which concurs with conventional wisdom that neither overconfidence nor low self-esteem brings benefits.
From skipped exercise classes to last-minute cancellation of dentist appointments, underutilization of reserved resources abounds. Likely reasons include uncertainty about the future, further exacerbated by present bias. In this paper, we unite resource allocation and commitment devices through the design of contingent payment mechanisms, and propose the two-bid penalty-bidding mechanism. This extends an earlier mechanism proposed by Ma et al. (2019), assigning the resources based on willingness to accept a no-show penalty, while also allowing each participant to increase her own penalty in order to counter present bias. We establish a simple dominant strategy equilibrium, regardless of an agent's level of present bias or degree of sophistication. Via simulations, we show that the proposed mechanism substantially improves utilization and achieves higher welfare and better equity in comparison with mechanisms used in practice and mechanisms that optimize welfare in the absence of present bias.
The flow of information reaching us via the online media platforms is optimized not by the information content or relevance but by popularity and proximity to the target. This is typically performed in order to maximise platform usage. As a side effect, this introduces an algorithmic bias that is believed to enhance polarization of the societal debate. To study this phenomenon, we modify the well-known continuous opinion dynamics model of bounded confidence in order to account for the algorithmic bias and investigate its consequences. In the simplest version of the original model the pairs of discussion participants are chosen at random and their opinions get closer to each other if they are within a fixed tolerance level. We modify the selection rule of the discussion partners: there is an enhanced probability to choose individuals whose opinions are already close to each other, thus mimicking the behavior of online media which suggest interaction with similar peers. As a result we observe: a) an increased tendency towards polarization, which emerges also in conditions where the original model would predict convergence, and b) a dramatic slowing down of the speed at which the convergence at the asymptotic state is reached, which makes the system highly unstable. Polarization is augmented by a fragmented initial population.
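The bounded-confidence dynamics described in the last abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: opinions are numbers in [0, 1], two agents move towards each other by a fraction `mu` of their gap when that gap is below the tolerance `eps`, and the "enhanced probability" of picking a like-minded partner is modeled here as a weight proportional to the opinion distance raised to the power `-gamma`. The weighting rule, the `floor` guard against zero distances, and all names are assumptions; `gamma = 0` recovers uniformly random partner selection.

```python
import random


def interact(x_i, x_j, eps, mu):
    """Bounded-confidence update: the two opinions move closer only if
    they differ by less than the tolerance eps."""
    if abs(x_i - x_j) < eps:
        return x_i + mu * (x_j - x_i), x_j + mu * (x_i - x_j)
    return x_i, x_j


def partner_weights(opinions, i, gamma, floor=1e-6):
    """Assumed selection rule: weight of partner j is |x_i - x_j|**(-gamma).
    The floor avoids division by zero; agent i never picks itself."""
    return [
        0.0 if j == i else max(abs(x - opinions[i]), floor) ** (-gamma)
        for j, x in enumerate(opinions)
    ]


def step(opinions, eps, mu, gamma, rng):
    """One interaction: pick agent i uniformly, pick a partner with the
    biased weights, then apply the bounded-confidence update in place."""
    i = rng.randrange(len(opinions))
    weights = partner_weights(opinions, i, gamma)
    j = rng.choices(range(len(opinions)), weights=weights, k=1)[0]
    opinions[i], opinions[j] = interact(opinions[i], opinions[j], eps, mu)


if __name__ == "__main__":
    rng = random.Random(42)
    population = [rng.random() for _ in range(200)]
    for _ in range(20000):
        step(population, eps=0.3, mu=0.5, gamma=1.0, rng=rng)
    # Inspect how many distinct opinion clusters remain (rounded to 2 digits).
    print(sorted({round(x, 2) for x in population}))
```

Running the loop with different values of `gamma` gives a quick way to explore how stronger partner-selection bias changes the number of surviving opinion clusters; the parameter values shown are arbitrary.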