Bounded rationality for relaxing best response and mutual consistency: An information-theoretic model of partial self-reference

Published by: Benjamin Patrick Evans

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Benjamin Patrick Evans, Mikhail Prokopenko

While game theory has been transformative for decision-making, the assumptions made can be overly restrictive in certain instances. In this work, we focus on some of the assumptions underlying rationality such as mutual consistency and best-response, and consider ways to relax these assumptions using concepts from level-$k$ reasoning and quantal response equilibrium (QRE) respectively. Specifically, we provide an information-theoretic two-parameter model that can relax both mutual consistency and best-response, but can recover approximations of level-$k$, QRE, or typical Nash equilibrium behaviour in the limiting cases. The proposed approach is based on a recursive form of the variational free energy principle, representing self-referential games as (pseudo) sequential decisions. Bounds in player processing abilities are captured as information costs, where future chains of reasoning are discounted, implying a hierarchy of players where lower-level players have fewer processing resources.
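As a rough illustration of how the two relaxations can interact, the sketch below combines a logit (quantal) response with a level-$k$ style recursion. It is not the authors' variational free energy formulation. It relies only on the standard result that maximising expected utility minus an information cost of $\frac{1}{\beta}\,\mathrm{KL}(p \,\|\, \text{prior})$ yields $p(a) \propto \text{prior}(a)\, e^{\beta U(a)}$. The function names, the parameter names `beta` and `gamma`, and the choice to discount the opponent's precision by `gamma` at each deeper level are all assumptions made for this example.

```python
import math

def softmax_response(utilities, beta, prior=None):
    """Logit response: p(a) proportional to prior(a) * exp(beta * U(a))."""
    n = len(utilities)
    prior = prior or [1.0 / n] * n
    m = max(utilities)  # subtract the max for numerical stability
    weights = [p * math.exp(beta * (u - m)) for p, u in zip(prior, utilities)]
    z = sum(weights)
    return [w / z for w in weights]

def expected_utilities(payoff, opponent_mix):
    """Expected payoff of each own action against a mixed opponent strategy."""
    return [sum(u * q for u, q in zip(row, opponent_mix)) for row in payoff]

def level_k_policy(payoff_self, payoff_other, k, beta, gamma):
    """Hypothetical recursion: level 0 plays uniformly; level k responds
    noisily to a level k-1 opponent whose precision is beta * gamma.

    payoff_self[a][b]: payoff to this player for own action a, opponent action b.
    payoff_other[b][a]: payoff to the opponent for its action b, our action a.
    """
    n = len(payoff_self)
    if k == 0:
        return [1.0 / n] * n
    opponent = level_k_policy(payoff_other, payoff_self, k - 1, beta * gamma, gamma)
    return softmax_response(expected_utilities(payoff_self, opponent), beta)

# Prisoner's dilemma, actions ordered (cooperate, defect); symmetric payoffs.
pd = [[3.0, 0.0],
      [5.0, 1.0]]
print(level_k_policy(pd, pd, k=2, beta=1.0, gamma=0.5))
```

In this sketch, `beta = 0` gives uniform play regardless of payoffs, while a large `beta` approaches a best response to the modelled opponent. With `gamma < 1`, deeper levels of the reasoning chain are modelled as noisier players, which is one simple way to mimic the discounting of future chains of reasoning that the abstract describes.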

Daniel Bashir, George D. Montanez, Sonia Sehra (2020)

We present an information-theoretic framework for understanding overfitting and underfitting in machine learning and prove the formal undecidability of determining whether an arbitrary classification algorithm will overfit a dataset. Measuring algorithm capacity via the information transferred from datasets to models, we consider mismatches between algorithm capacities and datasets to provide a signature for when a model can overfit or underfit a dataset. We present results upper-bounding algorithm capacity, establish its relationship to quantities in the algorithmic search framework for machine learning, and relate our work to recent information-theoretic approaches to generalization.

Machine Learning, Artificial Intelligence, Information Theory

Learning Convex Partitions and Computing Game-theoretic Equilibria from Best Response Queries

Paul W. Goldberg, Francisco J. Marmolejo-Cossio (2018)

Suppose that an $m$-simplex is partitioned into $n$ convex regions having disjoint interiors and distinct labels, and we may learn the label of any point by querying it. The learning objective is to know, for any point in the simplex, a label that occurs within some distance $\epsilon$ from that point. We present two algorithms for this task: Constant-Dimension Generalised Binary Search (CD-GBS), which for constant $m$ uses $\mathrm{poly}\left(n, \log\left(\frac{1}{\epsilon}\right)\right)$ queries, and Constant-Region Generalised Binary Search (CR-GBS), which uses CD-GBS as a subroutine and for constant $n$ uses $\mathrm{poly}\left(m, \log\left(\frac{1}{\epsilon}\right)\right)$ queries. We show via Kakutani's fixed-point theorem that these algorithms provide bounds on the best-response query complexity of computing approximate well-supported equilibria of bimatrix games in which one of the players has a constant number of pure strategies. We also partially extend our results to games with multiple players, establishing further query complexity bounds for computing approximate well-supported equilibria in this setting.

Computer Science and Game Theory, Machine Learning

Diffusion, Influence and Best-Response Dynamics in Networks: An Action Model Approach

Rasmus K. Rendsvig (2017)

Threshold models and their dynamics may be used to model the spread of "behaviors" in social networks. Regarding such from a modal logical perspective, it is shown how standard update mechanisms may be emulated using action models, that is, graphs encoding agents' decision rules. A small class of action models capturing the possible sets of decision rules suitable for threshold models is identified, and shown to include models characterizing best-response dynamics of both coordination and anti-coordination games played on graphs.
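For readers unfamiliar with threshold models, the following minimal sketch shows the kind of standard update mechanism the abstract refers to: an agent adopts a behavior once the fraction of its neighbors who have adopted reaches a threshold. This is a plain simulation, not the action-model encoding developed in the paper. The function names, the uniform threshold `theta`, and the assumption that adoption is irreversible are choices made for this example.

```python
def threshold_step(neighbors, adopted, theta):
    """One synchronous update: a non-adopter adopts if at least a fraction
    theta of its neighbors have already adopted. Adoption is irreversible."""
    new = set(adopted)
    for node, nbrs in neighbors.items():
        if node not in adopted and nbrs:
            frac = sum(1 for n in nbrs if n in adopted) / len(nbrs)
            if frac >= theta:
                new.add(node)
    return new

def spread(neighbors, seeds, theta):
    """Iterate the update from the seed set until a fixed point is reached."""
    adopted = set(seeds)
    while True:
        nxt = threshold_step(neighbors, adopted, theta)
        if nxt == adopted:
            return adopted
        adopted = nxt

# A path graph a - b - c - d, given as an adjacency mapping.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(sorted(spread(path, {"a"}, theta=0.5)))
```

On this path graph, a threshold of 0.5 lets the behavior travel from `a` to every node, whereas any threshold above 0.5 stops it at the seed, since `b` never sees more than half of its neighbors adopting.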

Computer Science and Game Theory, Logic in Computer Science, Multiagent Systems

Fully Decentralized Policies for Multi-Agent Systems: An Information Theoretic Approach

Roel Dobbe, David Fridovich-Keil, Claire Tomlin (2017)

Learning cooperative policies for multi-agent systems is often challenged by partial observability and a lack of coordination. In some settings, the structure of a problem allows a distributed solution with limited communication. Here, we consider a scenario where no communication is available, and instead we learn local policies for all agents that collectively mimic the solution to a centralized multi-agent static optimization problem. Our main contribution is an information theoretic framework based on rate distortion theory which facilitates analysis of how well the resulting fully decentralized policies are able to reconstruct the optimal solution. Moreover, this framework provides a natural extension that addresses which nodes an agent should communicate with to improve the performance of its individual policy.

Systems and Control, Artificial Intelligence, Information Theory

An Information-Theoretic Framework for Fast and Robust Unsupervised Learning via Neural Population Infomax

Wentao Huang, Kechen Zhang (2016)

A framework is presented for unsupervised learning of representations based on the infomax principle for large-scale neural populations. We use an asymptotic approximation to Shannon's mutual information for a large neural population to demonstrate that a good initial approximation to the global information-theoretic optimum can be obtained by a hierarchical infomax method. Starting from the initial solution, an efficient algorithm based on gradient descent of the final objective function is proposed to learn representations from the input datasets, and the method works for complete, overcomplete, and undercomplete bases. As confirmed by numerical experiments, our method is robust and highly efficient for extracting salient features from input datasets. Compared with the main existing methods, our algorithm has a distinct advantage in both the training speed and the robustness of unsupervised representation learning. Furthermore, the proposed method is easily extended to the supervised or unsupervised model for training deep structure networks.

Machine Learning, Artificial Intelligence, Information Theory