A maximum entropy model of bounded rational decision-making with prior beliefs and market feedback

Published by Benjamin Patrick Evans
Publication date: 2021
Research field: Informatics Engineering
Research language: English





Bounded rationality is an important consideration stemming from the fact that agents often have limits on their processing abilities, making the assumption of perfect rationality inapplicable to many real tasks. We propose an information-theoretic approach to the inference of agent decisions under Smithian competition. The model explicitly captures the boundedness of agents (limited in their information-processing capacity) as the cost of information acquisition for expanding their prior beliefs. The expansion is measured as the Kullback-Leibler divergence between posterior decisions and prior beliefs. When information acquisition is free, the homo economicus agent is recovered, while when information acquisition becomes costly, agents instead revert to their prior beliefs. The maximum entropy principle is used to infer least-biased decisions based upon the notion of Smithian competition formalised within the Quantal Response Statistical Equilibrium framework. The incorporation of prior beliefs into such a framework allowed us to systematically explore the effects of prior beliefs on decision-making in the presence of market feedback and, importantly, added a temporal interpretation to the framework. We verified the proposed model using Australian housing market data, showing how the incorporation of prior knowledge alters the resulting agent decisions. Specifically, it allowed for the separation of past beliefs and utility maximisation behaviour of the agent, as well as the analysis of the evolution of agent beliefs.
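The two limiting cases described in the abstract can be illustrated with a minimal sketch. It assumes the standard closed form for a choice distribution that maximises expected utility minus a Kullback-Leibler cost, namely p(a) proportional to prior(a) * exp(U(a) / T), where T plays the role of the cost of information acquisition. The function names, the three-action example and the utility values are hypothetical, and the sketch leaves out the market feedback that the Quantal Response Statistical Equilibrium framework adds.

```python
import numpy as np

def bounded_rational_choice(utilities, prior, temperature):
    """Return p(a) proportional to prior(a) * exp(U(a) / temperature).

    `temperature` stands in for the cost of information acquisition
    (an assumption of this sketch, not a name taken from the paper).
    """
    utilities = np.asarray(utilities, dtype=float)
    prior = np.asarray(prior, dtype=float)
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Work in log space and subtract the max for numerical stability.
    with np.errstate(divide="ignore"):
        log_weights = np.log(prior) + utilities / temperature
    log_weights -= log_weights.max()
    weights = np.exp(log_weights)
    return weights / weights.sum()

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q), skipping zero-probability terms of p."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical payoffs and prior beliefs over three actions.
utilities = np.array([1.0, 0.0, -1.0])
prior = np.array([0.2, 0.5, 0.3])

# Nearly free information: the agent concentrates on the best action.
cheap = bounded_rational_choice(utilities, prior, temperature=0.01)
# Very costly information: the agent stays close to its prior beliefs.
costly = bounded_rational_choice(utilities, prior, temperature=1000.0)

print("cheap information :", np.round(cheap, 4), "KL =", kl_divergence(cheap, prior))
print("costly information:", np.round(costly, 4), "KL =", kl_divergence(costly, prior))
```

With a small temperature the distribution approaches the utility-maximising choice and diverges strongly from the prior; with a large temperature the divergence from the prior shrinks towards zero.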


Read also

This work explores a social learning problem with agents having nonidentical noise variances and mismatched beliefs. We consider an $N$-agent binary hypothesis test in which each agent sequentially makes a decision based not only on a private observation, but also on preceding agents' decisions. In addition, the agents have their own beliefs instead of the true prior, and have nonidentical noise variances in the private signal. We focus on the Bayes risk of the last agent, where preceding agents are selfish. We first derive the optimal decision rule by recursive belief update and conclude, counterintuitively, that beliefs deviating from the true prior could be optimal in this setting. The effect of nonidentical noise levels in the two-agent case is also considered and analytical properties of the optimal belief curves are given. Next, we consider a predecessor selection problem wherein the subsequent agent of a certain belief chooses a predecessor from a set of candidates with varying beliefs. We characterize the decision region for choosing such a predecessor and argue that a subsequent agent with beliefs varying from the true prior often ends up selecting a suboptimal predecessor, indicating the need for a social planner. Lastly, we discuss an augmented intelligence design problem that uses a model of human behavior from cumulative prospect theory and investigate its near-optimality and suboptimality.
In a single-agent setting, reinforcement learning (RL) tasks can be cast into an inference problem by introducing a binary random variable o, which stands for optimality. In this paper, we redefine the binary random variable o in the multi-agent setting and formalize multi-agent reinforcement learning (MARL) as probabilistic inference. We derive a variational lower bound of the likelihood of achieving optimality and name it the Regularized Opponent Model with Maximum Entropy Objective (ROMMEO). From ROMMEO, we present a novel perspective on opponent modeling and show how it can improve the performance of training agents theoretically and empirically in cooperative games. To optimize ROMMEO, we first introduce a tabular Q-iteration method, ROMMEO-Q, with a proof of convergence. We extend the exact algorithm to complex environments by proposing an approximate version, ROMMEO-AC. We evaluate these two algorithms on the challenging iterated matrix game and differential game respectively and show that they can outperform strong MARL baselines.
Autonomous parking technology is a key concept within autonomous driving research. This paper proposes an imaginative autonomous parking algorithm to solve issues concerned with parking. The proposed algorithm consists of three parts: an imaginative model for anticipating results before parking, an improved rapidly-exploring random tree (RRT) for planning a feasible trajectory from a given start point to a parking lot, and a path smoothing module for optimizing the efficiency of parking tasks. Our algorithm is based on a real kinematic vehicle model, which makes it more suitable for application on real autonomous cars. Furthermore, due to the introduction of the imagination mechanism, the processing speed of our algorithm is ten times faster than that of traditional methods, permitting real-time planning. To evaluate the algorithm's effectiveness, we compared it with traditional RRT in three different parking scenarios. The results show that our algorithm is more stable than traditional RRT and performs better in terms of efficiency and quality.
Public debate forums provide a common platform for exchanging opinions on a topic of interest. While recent studies in natural language processing (NLP) have provided empirical evidence that the language of the debaters and their patterns of interaction play a key role in changing the mind of a reader, research in psychology has shown that prior beliefs can affect our interpretation of an argument and could therefore constitute a competing alternative explanation for resistance to changing one's stance. To study the actual effect of language use vs. prior beliefs on persuasion, we provide a new dataset and propose a controlled setting that takes into consideration two reader-level factors: political and religious ideology. We find that prior beliefs affected by these reader-level factors play a more important role than language use effects and argue that it is important to account for them in NLP studies of persuasion.
Two maximization problems of Rényi entropy rate are investigated: the maximization over all stochastic processes whose marginals satisfy a linear constraint, and the Burg-like maximization over all stochastic processes whose autocovariance function begins with some given values. The solutions are related to the solutions to the analogous maximization problems of Shannon entropy rate.
