Game-Theoretic Models of Moral and Other-Regarding Agents (extended abstract)


Published by: EPTCS

Publication date: 2021

Research field: Informatics Engineering

Language: English

Author: Gabriel Istrate

Subjects: Computer Science and Game Theory, Artificial Intelligence, Multiagent Systems


We investigate Kantian equilibria in finite normal form games, a class of non-Nashian, morally motivated courses of action that was recently proposed in the economics literature. We highlight a number of problems with such equilibria, including computational intractability, a high price of miscoordination, and problematic extension to general normal form games. We give such a generalization based on the concept of program equilibria, and point out that a practically relevant generalization may not exist. To remedy this, we propose some general, intuitive, computationally tractable, other-regarding equilibria that are special cases of Kantian equilibria, as well as a class of courses of action that interpolates between purely self-regarding and Kantian behavior.
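The abstract does not define Kantian equilibria. As a rough illustration only, the sketch below assumes the "simple Kantian equilibrium" reading from the economics literature, restricted to pure strategies in a symmetric two-player game: each player picks the action they would most want everyone to pick. The function names, the restriction to pure strategies, and the Prisoner's Dilemma payoffs are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence


def simple_kantian_pure(payoff: Sequence[Sequence[float]]) -> List[int]:
    """Pure actions a maximizing payoff[a][a], i.e. the payoff a player
    gets when BOTH players choose a (assumed symmetric game)."""
    diagonal = [payoff[a][a] for a in range(len(payoff))]
    best = max(diagonal)
    return [a for a, value in enumerate(diagonal) if value == best]


def pure_symmetric_nash(payoff: Sequence[Sequence[float]]) -> List[int]:
    """Pure actions a such that (a, a) is a Nash equilibrium: no
    unilateral deviation b improves on payoff[a][a]."""
    n = len(payoff)
    return [
        a for a in range(n)
        if all(payoff[a][a] >= payoff[b][a] for b in range(n))
    ]


# Hypothetical Prisoner's Dilemma: payoff[i][j] is the row player's payoff
# when the row player plays i and the column player plays j.
# Action 0 = cooperate, action 1 = defect.
PRISONERS_DILEMMA = [
    [3, 0],
    [5, 1],
]

kantian = simple_kantian_pure(PRISONERS_DILEMMA)
nash = pure_symmetric_nash(PRISONERS_DILEMMA)
print("Kantian (pure, symmetric):", kantian)
print("Nash (pure, symmetric):", nash)
```

Under these assumptions the Kantian choice is mutual cooperation while the only symmetric pure Nash equilibrium is mutual defection, which is the sense in which such courses of action are "non-Nashian". The sketch says nothing about mixed strategies or asymmetric games, where the abstract reports the difficulties.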


151 - Dmitry Ivanov , Vladimir Egorov , Aleksei Shpilman 2021

Recent reinforcement learning studies extensively explore the interplay between cooperative and competitive behaviour in mixed environments. Unlike cooperative environments where agents strive towards a common goal, mixed environments are notorious for the conflicts of selfish and social interests. As a consequence, purely rational agents often struggle to achieve and maintain cooperation. A prevalent approach to induce cooperative behaviour is to assign additional rewards based on other agents' well-being. However, this approach suffers from the issue of multi-agent credit assignment, which can hinder performance. This issue is efficiently alleviated in cooperative settings with such state-of-the-art algorithms as QMIX and COMA. Still, when applied to mixed environments, these algorithms may result in unfair allocation of rewards. We propose BAROCCO, an extension of these algorithms capable of balancing individual and social incentives. The mechanism behind BAROCCO is to train two distinct but interwoven components that jointly affect each agent's decisions. Our meta-algorithm is compatible with both Q-learning and Actor-Critic frameworks. We experimentally confirm the advantages over the existing methods and explore the behavioural aspects of BAROCCO in two mixed multi-agent setups.

Subjects: Machine Learning, Artificial Intelligence, Multiagent Systems

Learning to Incentivize Other Learning Agents

206 - Jiachen Yang , Ang Li , Mehrdad Farajtabar 2020

The challenge of developing powerful and general Reinforcement Learning (RL) agents has received increasing attention in recent years. Much of this effort has focused on the single-agent setting, in which an agent maximizes a predefined extrinsic reward function. However, a long-term question inevitably arises: how will such independent agents cooperate when they are continually learning and acting in a shared multi-agent environment? Observing that humans often provide incentives to influence others' behavior, we propose to equip each RL agent in a multi-agent environment with the ability to give rewards directly to other agents, using a learned incentive function. Each agent learns its own incentive function by explicitly accounting for its impact on the learning of recipients and, through them, the impact on its own extrinsic objective. We demonstrate in experiments that such agents significantly outperform standard RL and opponent-shaping agents in challenging general-sum Markov games, often by finding a near-optimal division of labor. Our work points toward more opportunities and challenges along the path to ensure the common good in a multi-agent future.

Subjects: Machine Learning, Computer Science and Game Theory, Multiagent Systems

A Game-Theoretic Analysis of Shard-Based Permissionless Blockchains

160 - Mohammad Hossein Manshaei , Murtuza Jadliwala , Anindya Maiti 2018

Low transaction throughput and poor scalability are significant issues in public blockchain consensus protocols such as Bitcoin's. Recent research efforts in this direction have proposed shard-based consensus protocols where the key idea is to split the transactions among multiple committees (or shards), which then process these shards or sets of transactions in parallel. Such parallel processing of disjoint sets of transactions or shards by multiple committees significantly improves the overall scalability and transaction throughput of the system. However, one significant research gap is a lack of understanding of the strategic behavior of rational processors within committees in such shard-based consensus protocols. Such an understanding is critical for designing appropriate incentives that will foster cooperation within committees and prevent free-riding. In this paper, we address this research gap by analyzing the behavior of processors using a game-theoretic model, where each processor aims at maximizing its reward at a minimum cost of participating in the protocol. We first analyze the Nash equilibria in an N-player static game model of the sharding protocol. We show that depending on the reward sharing approach employed, processors can potentially increase their payoff by unilaterally behaving in a defective fashion, thus resulting in a social dilemma. In order to overcome this social dilemma, we propose a novel incentive-compatible reward sharing mechanism to promote cooperation among processors. Our numerical results show that a majority of cooperating processors (required to ensure a healthy state of the blockchain network) is easier to achieve with the proposed incentive-compatible reward sharing mechanism than with other reward sharing mechanisms.

Subjects: Computer Science and Game Theory

Optimality and Stability in Federated Learning: A Game-theoretic Approach

612 - Kate Donahue , Jon Kleinberg 2021

Federated learning is a distributed learning paradigm where multiple agents, each only with access to local data, jointly learn a global model. There has recently been an explosion of research aiming not only to improve the accuracy rates of federated learning, but also to provide certain guarantees around social good properties such as total error. One branch of this research has taken a game-theoretic approach, and in particular, prior work has viewed federated learning as a hedonic game, where error-minimizing players arrange themselves into federating coalitions. This past work proves the existence of stable coalition partitions, but leaves open a wide range of questions, including how far from optimal these stable solutions are. In this work, we motivate and define a notion of optimality given by the average error rates among federating agents (players). First, we provide and prove the correctness of an efficient algorithm to calculate an optimal (error minimizing) arrangement of players. Next, we analyze the relationship between the stability and optimality of an arrangement. We show that for some regions of parameter space, all stable arrangements are optimal (Price of Anarchy equal to 1). However, we show this is not true for all settings: there exist examples of stable arrangements with higher cost than optimal (Price of Anarchy greater than 1). Finally, we give the first constant-factor bound on the performance gap between stability and optimality, proving that the total error of the worst stable solution can be no higher than 9 times the total error of an optimal solution (Price of Anarchy bound of 9).

Subjects: Computer Science and Game Theory, Computers and Society, Distributed, Parallel, and Cluster Computing

A Game-Theoretic Approach for Hierarchical Policy-Making

157 - Feiran Jia , Aditya Mate , Zun Li 2021

We present the design and analysis of a multi-level game-theoretic model of hierarchical policy-making, inspired by policy responses to the COVID-19 pandemic. Our model captures the potentially mismatched priorities among a hierarchy of policy-makers (e.g., federal, state, and local governments) with respect to two main cost components that have opposite dependence on the policy strength, such as post-intervention infection rates and the cost of policy implementation. Our model further includes a crucial third factor in decisions: a cost of non-compliance with the policy-maker immediately above in the hierarchy, such as non-compliance of state with federal policies. Our first contribution is a closed-form approximation of a recently published agent-based model to compute the number of infections for any implemented policy. Second, we present a novel equilibrium selection criterion that addresses common issues with equilibrium multiplicity in our setting. Third, we propose a hierarchical algorithm based on best response dynamics for computing an approximate equilibrium of the hierarchical policy-making game consistent with our solution concept. Finally, we present an empirical investigation of equilibrium policy strategies in this game in terms of the extent of free riding as well as fairness in the distribution of costs depending on game parameters such as the degree of centralization and disagreements about policy priorities among the agents.

Subjects: Computer Science and Game Theory, Multiagent Systems