
A Game-Theoretic Analysis of Updating Sets of Probabilities

Published by Peter D Grunwald
Publication date: 2014
Research field: Informatics Engineering
Research language: English





We consider how an agent should update her uncertainty when it is represented by a set P of probability distributions and the agent observes that a random variable X takes on value x, given that the agent makes decisions using the minimax criterion, perhaps the best-studied and most commonly-used criterion in the literature. We adopt a game-theoretic framework, where the agent plays against a bookie, who chooses some distribution from P. We consider two reasonable games that differ in what the bookie knows when he makes his choice. Anomalies that have been observed before, like time inconsistency, can be understood as arising because different games are being played, against bookies with different information. We characterize the important special cases in which the optimal decision rules according to the minimax criterion amount to either conditioning or simply ignoring the information. Finally, we consider the relationship between conditioning and calibration when uncertainty is described by sets of probabilities.
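As a rough illustration of the terms used in the abstract, the sketch below conditions every distribution in a small finite set P on an observation X = x and then picks the action with the smallest worst-case expected loss. This is only the plain minimax rule applied after conditioning, not the paper's game against a bookie, and the abstract notes that conditioning is minimax-optimal only in special cases. The distributions, the 0/1 loss, and all function names are assumptions made up for this example.

```python
from fractions import Fraction as F

def condition(dist, event):
    """Condition a finite distribution {outcome: prob} on an event (a predicate)."""
    mass = sum(p for o, p in dist.items() if event(o))
    if mass == 0:
        raise ValueError("cannot condition on a zero-probability event")
    return {o: p / mass for o, p in dist.items() if event(o)}

def expected_loss(dist, action, loss):
    """Expected loss of an action under a single distribution."""
    return sum(p * loss(action, o) for o, p in dist.items())

def worst_case_loss(dists, action, loss):
    """Largest expected loss of an action over all distributions in the set."""
    return max(expected_loss(d, action, loss) for d in dists)

def minimax_action(dists, actions, loss):
    """Action minimizing the worst-case expected loss over the set."""
    return min(actions, key=lambda a: worst_case_loss(dists, a, loss))

# Hypothetical set P over outcomes (x, y) with x, y in {0, 1}.
P = [
    {(0, 0): F(4, 10), (0, 1): F(1, 10), (1, 0): F(1, 10), (1, 1): F(4, 10)},
    {(0, 0): F(3, 10), (0, 1): F(2, 10), (1, 0): F(2, 10), (1, 1): F(3, 10)},
]

def zero_one_loss(action, outcome):
    """Loss 0 if the predicted y matches the realized y, otherwise 1."""
    return 0 if action == outcome[1] else 1

# Observe X = 0, condition each member of P, then decide by minimax.
posterior = [condition(d, lambda o: o[0] == 0) for d in P]
best = minimax_action(posterior, [0, 1], zero_one_loss)
worst = worst_case_loss(posterior, best, zero_one_loss)
```

Here `best` is the prediction for y after observing X = 0, and `worst` is its guaranteed upper bound on expected loss across the conditioned set. Ignoring the information instead would amount to passing `P` itself, rather than `posterior`, to `minimax_action`.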




Read also

As examples such as the Monty Hall puzzle show, applying conditioning to update a probability distribution on a "naive" space, which does not take into account the protocol used, can often lead to counterintuitive results. Here we examine why. A criterion known as CAR (coarsening at random) in the statistical literature characterizes when "naive" conditioning in a naive space works. We show that the CAR condition holds rather infrequently. We then consider more generalized notions of update such as Jeffrey conditioning and minimizing relative entropy (MRE). We give a generalization of the CAR condition that characterizes when Jeffrey conditioning leads to appropriate answers, but show that there are no such conditions for MRE. This generalizes and interconnects previous results obtained in the literature on CAR and MRE.
To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL). The simplest form is independent reinforcement learning (InRL), where each agent treats its experience as part of its (non-stationary) environment. In this paper, we first observe that policies learned using InRL can overfit to the other agents' policies during training, failing to sufficiently generalize during execution. We introduce a new metric, joint-policy correlation, to quantify this effect. We describe an algorithm for general MARL, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection. The algorithm generalizes previous ones such as InRL, iterated best response, double oracle, and fictitious play. Then, we present a scalable implementation which reduces the memory requirement using decoupled meta-solvers. Finally, we demonstrate the generality of the resulting policies in two partially observable settings: gridworld coordination games and poker.
Low transaction throughput and poor scalability are significant issues in public blockchain consensus protocols such as Bitcoin's. Recent research efforts in this direction have proposed shard-based consensus protocols where the key idea is to split the transactions among multiple committees (or shards), which then process these shards or sets of transactions in parallel. Such parallel processing of disjoint sets of transactions or shards by multiple committees significantly improves the overall scalability and transaction throughput of the system. However, one significant research gap is a lack of understanding of the strategic behavior of rational processors within committees in such shard-based consensus protocols. Such an understanding is critical for designing appropriate incentives that will foster cooperation within committees and prevent free-riding. In this paper, we address this research gap by analyzing the behavior of processors using a game-theoretic model, where each processor aims at maximizing its reward at a minimum cost of participating in the protocol. We first analyze the Nash equilibria in an N-player static game model of the sharding protocol. We show that depending on the reward sharing approach employed, processors can potentially increase their payoff by unilaterally behaving in a defective fashion, thus resulting in a social dilemma. In order to overcome this social dilemma, we propose a novel incentive-compatible reward sharing mechanism to promote cooperation among processors. Our numerical results show that a majority of cooperating processors (required to ensure a healthy state of the blockchain network) is easier to achieve with the proposed incentive-compatible reward sharing mechanism than with other reward sharing mechanisms.
In timeline-based planning, domains are described as sets of independent, but interacting, components, whose behaviour over time (the set of timelines) is governed by a set of temporal constraints. A distinguishing feature of timeline-based planning systems is the ability to integrate planning with execution by synthesising control strategies for flexible plans. However, flexible plans can only represent temporal uncertainty, while more complex forms of nondeterminism are needed to deal with a wider range of realistic problems. In this paper, we propose a novel game-theoretic approach to timeline-based planning problems, generalising the state of the art while uniformly handling temporal uncertainty and nondeterminism. We define a general concept of timeline-based game and we show that the notion of winning strategy for these games is strictly more general than that of control strategy for dynamically controllable flexible plans. Moreover, we show that the problem of establishing the existence of such winning strategies is decidable using a doubly exponential amount of space.
An interaction system has a finite set of agents that interact pairwise, depending on the current state of the system. Symmetric decomposition of the matrix of interaction coefficients yields the representation of states by self-adjoint matrices and hence a spectral representation. As a result, cooperation systems, decision systems and quantum systems all become visible as manifestations of special interaction systems. The treatment of the theory is purely mathematical and does not require any special knowledge of physics. It is shown how standard notions in cooperative game theory arise naturally in this context. In particular, Fourier transformation of cooperative games becomes meaningful. Moreover, quantum games fall into this framework. Finally, a theory of Markov evolution of interaction states is presented that generalizes classical homogeneous Markov chains to the present context.
