
Convergence of Learning Dynamics in Information Retrieval Games

Posted by Omer Ben-Porat
Publication date: 2018
Research field: Information engineering
Paper language: English





We consider a game-theoretic model of information retrieval with strategic authors. We examine two different utility schemes: authors who aim to maximize exposure and authors who want to maximize active selection of their content (i.e., the number of clicks). We introduce the study of author learning dynamics in such contexts. We prove that under the probability ranking principle (PRP), which forms the basis of current state-of-the-art ranking methods, any better-response learning dynamics converges to a pure Nash equilibrium. We also show that other ranking methods induce a strategic environment in which such convergence may not occur.
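Convergence results of this kind are often proved with a potential argument: every strict better-response step decreases a potential function, so the dynamics cannot cycle and must stop at a pure Nash equilibrium. The Python sketch below illustrates that mechanism on a toy congestion game; it is not the paper's retrieval model, and the players, costs, and parameters are illustrative assumptions only.

# A minimal sketch of better-response dynamics, assuming a toy congestion
# game rather than the paper's retrieval model: each player picks one of
# two resources and pays the load on its choice. Every strict improvement
# lowers the Rosenthal potential, so the loop must reach a pure Nash
# equilibrium.

def cost(profile, player):
    # Congestion cost: the load on the resource the player chose.
    return sum(1 for r in profile if r == profile[player])

def better_response_dynamics(n_players=4, n_resources=2, max_sweeps=100):
    profile = [0] * n_players                # everyone starts on resource 0
    for sweep in range(max_sweeps):
        improved = False
        for p in range(n_players):
            for r in range(n_resources):
                trial = profile[:p] + [r] + profile[p + 1:]
                if cost(trial, p) < cost(profile, p):   # strictly better response
                    profile, improved = trial, True
                    break
        if not improved:                     # no profitable deviation: pure NE
            return profile, sweep
    return profile, max_sweeps

profile, sweeps = better_response_dynamics()
print(f"pure Nash equilibrium {profile} after {sweeps} sweeps")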




Read also

This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppose that while repeatedly playing this game, the row player chooses her strategy at each round by using a no-regret algorithm to minimize her (pseudo) regret. We develop a no-instant-regret algorithm for the column player that exhibits last-round convergence to a minimax equilibrium. We show that our algorithm is efficient against a large set of popular no-regret algorithms of the row player, including the multiplicative weight update algorithm, the online mirror descent method/follow-the-regularized-leader, the linear multiplicative weight update algorithm, and the optimistic multiplicative weight update.
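For context, the multiplicative weight update (MWU) rule the row player might run is a one-line exponential reweighting. The Python sketch below shows MWU against a best-responding column player in a small zero-sum game, where the time-averaged strategy approaches minimax; the paper's no-instant-regret column algorithm with last-round convergence is not reproduced here, and the payoff matrix and learning rate are illustrative.

import numpy as np

# Sketch: the row player runs multiplicative weight updates (MWU) in a
# repeated zero-sum game while the column player best-responds each round.
# The row player's time-averaged strategy approaches minimax.

A = np.array([[0.0, 1.0],        # payoff to the row player (illustrative)
              [1.0, 0.0]])
eta = 0.1                        # MWU learning rate
w = np.ones(2)                   # row player's action weights
avg = np.zeros(2)

T = 5000
for t in range(T):
    x = w / w.sum()              # row player's current mixed strategy
    j = np.argmin(x @ A)         # column player minimizes the row's payoff
    w *= np.exp(eta * A[:, j])   # upweight row actions that did well
    avg += x

print("time-averaged row strategy:", (avg / T).round(3))   # -> ~[0.5, 0.5]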
Mario Benevides, 2014
In this paper we describe an approach to solving strategic games in which players can assume different types over the course of the game. Our goal is to infer which type the opponent is adopting at each moment so that we can increase the player's odds. To achieve this we use Markov games combined with a hidden Markov model. We discuss a hypothetical example of a tennis game whose solution can be applied to any game with similar characteristics.
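The inference step described here, tracking the opponent's latent type from observed actions, is the standard forward recursion of a hidden Markov model. Below is a minimal Python sketch with two hypothetical types; the transition and emission probabilities are made up, not taken from the paper.

import numpy as np

# Forward recursion of an HMM tracking which type the opponent is playing.
types = ["aggressive", "defensive"]
T = np.array([[0.9, 0.1],        # P(next type | current type)
              [0.2, 0.8]])
E = np.array([[0.7, 0.3],        # P(action | type); action 0 = attack,
              [0.3, 0.7]])       # action 1 = lob

belief = np.array([0.5, 0.5])    # prior over the opponent's type
for action in [0, 0, 1, 1, 1]:   # observed opponent actions
    belief = belief @ T              # predict: the type may have switched
    belief = belief * E[:, action]   # update with the emission likelihood
    belief /= belief.sum()           # renormalize to a distribution
    print(dict(zip(types, belief.round(3))))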
In some games, additional information hurts a player; e.g., in games with a first-mover advantage, the second mover is hurt by seeing the first mover's move. What properties of a game determine whether it has such negative value of information for a particular player? Can a game have negative value of information for all players? To answer such questions, we generalize the definition of the marginal utility of a good to define the marginal utility of a parameter vector specifying a game. So rather than analyze the global structure of the relationship between a game's parameter vector and player behavior, as in previous work, we focus on the local structure of that relationship. This allows us to prove that, generically, every game can have negative marginal value of information, unless one imposes a priori constraints on allowed changes to the game's parameter vector. We demonstrate these and related results numerically, and discuss their implications.
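A classical concrete instance of the opening example is the Cournot/Stackelberg duopoly: the follower observes the leader's quantity, yet earns less than under simultaneous play. The short Python computation below works out that comparison for linear inverse demand; the intercept and zero-cost assumption are illustrative, and this is a global comparison, not the paper's local marginal-utility analysis.

# Toy computation: in a Stackelberg duopoly the follower observes the
# leader's quantity, yet earns less than in the simultaneous Cournot game.
# Inverse demand P(Q) = a - Q with zero costs; a = 12 is illustrative.

a = 12.0

# Simultaneous (Cournot): best responses q_i = (a - q_j) / 2 meet at a / 3.
q = a / 3
cournot_profit = (a - 2 * q) * q                 # 16.0 per firm

# Sequential (Stackelberg): the follower best-responds q2 = (a - q1) / 2,
# so the leader maximizes (a - q1) * q1 / 2, giving q1 = a / 2.
q1 = a / 2
q2 = (a - q1) / 2
follower_profit = (a - q1 - q2) * q2             # 9.0 < 16.0

print(f"Cournot profit per firm:     {cournot_profit:.1f}")
print(f"Stackelberg follower profit: {follower_profit:.1f}  (observing hurts)")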
We define the notion of Bayes correlated Wardrop equilibrium for general nonatomic games with anonymous players and incomplete information. Bayes correlated Wardrop equilibria describe the set of equilibrium outcomes when a mediator, such as a traffic information system, provides information to the players. We relate this notion to Bayes Wardrop equilibrium. Then, we provide conditions -- existence of a convex potential and complete information -- under which mediation does not improve equilibrium outcomes. We then study full implementation and, finally, information design in anonymous games with a finite set of players, when the number of players tends to infinity.
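As a minimal illustration of the Wardrop and convex-potential machinery, the Python sketch below computes the Wardrop equilibrium of a two-route nonatomic routing game with linear latencies, where equilibrium equalizes the latencies of the used routes and coincides with the minimizer of the convex Beckmann potential. The coefficients are made up, and no mediator or incomplete information is modeled.

# Wardrop equilibrium of a two-route nonatomic routing game with linear
# latencies l1(x) = a1*x + b1 and l2(x) = a2*x + b2 and unit demand.

def wardrop_two_routes(a1, b1, a2, b2, demand=1.0):
    # Equalize latencies: a1*x + b1 = a2*(demand - x) + b2, solve for x.
    x = (a2 * demand + b2 - b1) / (a1 + a2)
    x = min(max(x, 0.0), demand)     # clamp if one route carries everything
    return x, demand - x

x1, x2 = wardrop_two_routes(a1=1.0, b1=0.0, a2=2.0, b2=0.5)
print(f"flow on route 1: {x1:.3f}, flow on route 2: {x2:.3f}")
print(f"latencies: {1.0 * x1 + 0.0:.3f} vs {2.0 * x2 + 0.5:.3f}")   # equal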
The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best exemplified by AlphaZero. However, prior algorithms of this form cannot cope with imperfect-information games. This paper presents ReBeL, a general framework for self-play reinforcement learning and search that provably converges to a Nash equilibrium in any two-player zero-sum game. In the simpler setting of perfect-information games, ReBeL reduces to an algorithm similar to AlphaZero. Results in two different imperfect-information games show ReBeL converges to an approximate Nash equilibrium. We also show ReBeL achieves superhuman performance in heads-up no-limit Texas hold'em poker, while using far less domain knowledge than any prior poker AI.
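ReBeL itself combines value networks with search and cannot be condensed into a few lines, but a much simpler relative of its convergence guarantee can be shown with regret matching, the tabular primitive behind the CFR family of solvers that such systems draw on. The Python sketch below runs regret matching in self-play on rock-paper-scissors; the game and iteration count are illustrative, and the average strategies converge to the Nash equilibrium.

import numpy as np

# Regret matching in self-play on rock-paper-scissors. The AVERAGE
# strategies converge to the Nash equilibrium; ReBeL's value networks
# and search are not reproduced here.

A = np.array([[0.0, -1.0, 1.0],      # row player's payoff (zero-sum)
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])

def strategy(regret):
    pos = np.maximum(regret, 0.0)    # play in proportion to positive regret
    return pos / pos.sum() if pos.sum() > 0 else np.full(3, 1 / 3)

r_row, r_col = np.zeros(3), np.zeros(3)
avg_row = np.zeros(3)

T = 20000
for t in range(T):
    x, y = strategy(r_row), strategy(r_col)
    v = x @ A @ y                    # current value for the row player
    r_row += A @ y - v               # regret vs. each pure row action
    r_col += v - x @ A               # column player minimizes, signs flip
    avg_row += x

print("average row strategy:", (avg_row / T).round(3))   # -> ~[1/3, 1/3, 1/3]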