ترغب بنشر مسار تعليمي؟ اضغط هنا

Levy walks derived from a Bayesian decision-making model in non-stationary environments

105   0   0.0 ( 0 )
 نشر من قبل Shuji Shinohara Shinohara
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Levy walks are found in the migratory behaviour patterns of various organisms, and the reason for this phenomenon has been much discussed. We use simulations to demonstrate that learning causes the changes in confidence level during decision-making in non-stationary environments, and results in Levy-walk-like patterns. One inference algorithm involving confidence is Bayesian inference. We propose an algorithm that introduces the effects of learning and forgetting into Bayesian inference, and simulate an imitation game in which two decision-making agents incorporating the algorithm estimate each others internal models from their opponents observational data. For forgetting without learning, agent confidence levels remained low due to a lack of information on the counterpart and Brownian walks occurred for a wide range of forgetting rates. Conversely, when learning was introduced, high confidence levels occasionally occurred even at high forgetting rates, and Brownian walks universally became Levy walks through a mixture of high- and low-confidence states.



قيم البحث

اقرأ أيضاً

154 - Shuo Li , Matteo Pozzi 2021
Decision makers involved in the management of civil assets and systems usually take actions under constraints imposed by societal regulations. Some of these constraints are related to epistemic quantities, as the probability of failure events and the corresponding risks. Sensors and inspectors can provide useful information supporting the control process (e.g. the maintenance process of an asset), and decisions about collecting this information should rely on an analysis of its cost and value. When societal regulations encode an economic perspective that is not aligned with that of the decision makers, the Value of Information (VoI) can be negative (i.e., information sometimes hurts), and almost irrelevant information can even have a significant value (either positive or negative), for agents acting under these epistemic constraints. We refer to these phenomena as Information Avoidance (IA) and Information OverValuation (IOV). In this paper, we illustrate how to assess VoI in sequential decision making under epistemic constraints (as those imposed by societal regulations), by modeling a Partially Observable Markov Decision Processes (POMDP) and evaluating non optimal policies via Finite State Controllers (FSCs). We focus on the value of collecting information at current time, and on that of collecting sequential information, we illustrate how these values are related and we discuss how IA and IOV can occur in those settings.
Potential buyers of a product or service tend to read reviews from previous consumers before making their decisions. This behavior is modeled by a market of Bayesian consumers with heterogeneous preferences, who sequentially decide whether to buy an item of unknown quality, based on previous buyers reviews. The quality is multi-dimensional and the reviews can assume one of different forms and can also be multi-dimensional. The belief about the items quality in simple uni-dimensional settings is known to converge to its true value. Our paper extends this result to the more general case of a multidimensional quality, possibly in a continuous space, and provides anytime convergence rates. In practice, the quality of an item may vary over time, due to some change in the production process or the need to keep up with the competition. This paper also studies the learning dynamic when the unknown quality changes at random times and shows that the cost of learning is rather small
119 - Song-Ju Kim , Masashi Aono , 2014
We demonstrate that any physical object, as long as its volume is conserved when coupled with suitable operations, provides a sophisticated decision-making capability. We consider the problem of finding, as accurately and quickly as possible, the mos t profitable option from a set of options that gives stochastic rewards. These decisions are made as dictated by a physical object, which is moved in a manner similar to the fluctuations of a rigid body in a tug-of-war game. Our analytical calculations validate statistical reasons why our method exhibits higher efficiency than conventional algorithms.
We propose a new approach for solving a class of discrete decision making problems under uncertainty with positive cost. This issue concerns multiple and diverse fields such as engineering, economics, artificial intelligence, cognitive science and ma ny others. Basically, an agent has to choose a single or series of actions from a set of options, without knowing for sure their consequences. Schematically, two main approaches have been followed: either the agent learns which option is the correct one to choose in a given situation by trial and error, or the agent already has some knowledge on the possible consequences of his decisions; this knowledge being generally expressed as a conditional probability distribution. In the latter case, several optimal or suboptimal methods have been proposed to exploit this uncertain knowledge in various contexts. In this work, we propose following a different approach, based on the geometric intuition of distance. More precisely, we define a goal independent quasimetric structure on the state space, taking into account both cost function and transition probability. We then compare precision and computation time with classical approaches.
We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks in stochastic, complex environments. By modeling the agents behavior as a Markov decision process, we consider a setting where the agent aims to reach one of multiple potential goals while deceiving outside observers about its true goal. We propose a novel approach to model observer predictions based on the principle of maximum entropy and to efficiently generate deceptive strategies via linear programming. The proposed approach enables the agent to exhibit a variety of tunable deceptive behaviors while ensuring the satisfaction of probabilistic constraints on the behavior. We evaluate the performance of the proposed approach via comparative user studies and present a case study on the streets of Manhattan, New York, using real travel time distributions.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا