
Diversity and Exploration in Social Learning

Published by Jieming Mao
Publication date: 2019
Research field: Informatics Engineering
Language: English





In consumer search, there is a set of items. An agent has a prior over her value for each item and can pay a cost to learn the instantiation of her value. After exploring a subset of items, the agent chooses one and obtains a payoff equal to its value minus the search cost. We consider a sequential model of consumer search in which agents' values are correlated and each agent updates her priors based on the exploration of past agents before performing her own search. Specifically, we assume the value is the sum of a common-value component, called the quality, and a subjective score. Fixing the variance of the total value, we say a population is more diverse if the subjective score has a larger variance. We ask how diversity impacts average utility. We show that intermediate diversity levels yield significantly higher social utility than the extreme cases of no diversity (when agents under-explore) or full diversity (when agents are unable to learn from each other), and we quantify how the impact of the diversity level changes depending on the time spent searching.
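The value decomposition described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `split_variance` and `sample_values`, the `diversity` parameter (the share of the fixed total variance assigned to the subjective score), and the choice of independent zero-mean Gaussian components.

```python
import random


def split_variance(total_var, diversity):
    """Split a fixed total variance into (quality_var, subjective_var).

    `diversity` is a hypothetical parameter in [0, 1]: the share of the
    total variance carried by the subjective score. 0.0 means no
    diversity, 1.0 means full diversity.
    """
    if not 0.0 <= diversity <= 1.0:
        raise ValueError("diversity must lie in [0, 1]")
    subjective_var = diversity * total_var
    return total_var - subjective_var, subjective_var


def sample_values(n_agents, total_var=1.0, diversity=0.5, seed=0):
    """Draw each agent's value for one item: shared quality + own score.

    Assumes independent zero-mean Gaussian components (an illustrative
    choice, not a claim about the paper's distributions).
    """
    rng = random.Random(seed)
    quality_var, subjective_var = split_variance(total_var, diversity)
    quality = rng.gauss(0.0, quality_var ** 0.5)  # common to all agents
    return [quality + rng.gauss(0.0, subjective_var ** 0.5)
            for _ in range(n_agents)]
```

With `diversity=0.0` every agent sees the same value, so one agent's exploration is fully informative for the next; with `diversity=1.0` the shared quality is constant and past exploration reveals nothing about a later agent's value.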


Read also

Recent research on reinforcement learning in pure-conflict and pure-common interest games has emphasized the importance of population heterogeneity. In contrast, studies of reinforcement learning in mixed-motive games have primarily leveraged homogeneous approaches. Given the defining characteristic of mixed-motive games, namely the imperfect correlation of incentives between group members, we study the effect of population heterogeneity on mixed-motive reinforcement learning. We draw on interdependence theory from social psychology and imbue reinforcement learning agents with Social Value Orientation (SVO), a flexible formalization of preferences over group outcome distributions. We subsequently explore the effects of diversity in SVO on populations of reinforcement learning agents in two mixed-motive Markov games. We demonstrate that heterogeneity in SVO generates meaningful and complex behavioral variation among agents similar to that suggested by interdependence theory. Empirical results in these mixed-motive dilemmas suggest agents trained in heterogeneous populations develop particularly generalized, high-performing policies relative to those trained in homogeneous populations.
We consider a ubiquitous scenario in the Internet economy in which individual decision-makers (henceforth, agents) both produce and consume information as they make strategic choices in an uncertain environment. This creates a three-way tradeoff between exploration (trying out insufficiently explored alternatives to help others in the future), exploitation (making optimal decisions given the information discovered by other agents), and incentives of the agents (who are myopically interested in exploitation, while preferring the others to explore). We posit a principal who controls the flow of information from agents that came before, and strives to coordinate the agents towards a socially optimal balance between exploration and exploitation, not using any monetary transfers. The goal is to design a recommendation policy for the principal which respects agents' incentives and minimizes a suitable notion of regret. We extend prior work in this direction to allow the agents to interact with one another in a shared environment: at each time step, multiple agents arrive to play a Bayesian game, receive recommendations, choose their actions, receive their payoffs, and then leave the game forever. The agents now face two sources of uncertainty: the actions of the other agents and the parameters of the uncertain game environment. Our main contribution is to show that the principal can achieve constant regret when the utilities are deterministic (where the constant depends on the prior distribution, but not on the time horizon), and logarithmic regret when the utilities are stochastic. As a key technical tool, we introduce the concept of explorable actions, the actions which some incentive-compatible policy can recommend with non-zero probability. We show how the principal can identify (and explore) all explorable actions, and use the revealed information to perform optimally.
Xuanyu Cao, K. J. Ray Liu (2017)
In this work, we study the social learning problem, in which agents of a networked system collaborate to detect the state of nature based on their private signals. A novel distributed graphical evolutionary game-theoretic learning method is proposed. In the proposed method, agents only need to communicate their binary decisions, rather than their real-valued beliefs, to their neighbors, which gives the method low communication complexity. Under mean-field approximations, we theoretically analyze the steady-state equilibria of the game and show that the evolutionarily stable states (ESSs) coincide with the decisions of the benchmark centralized detector. Numerical experiments confirm the effectiveness of the proposed game-theoretic learning method.
How users in a dynamic system learn and make decisions is becoming increasingly important in numerous research fields. Although some works in the social learning literature address how to construct a belief about an uncertain system state, few studies have incorporated social learning into decision making. Moreover, users may make multiple concurrent decisions on different objects or resources, and their decisions usually negatively influence each other's utility, which makes the problem even more challenging. In this paper, we propose an Indian Buffet Game to study how users in a dynamic system learn the uncertain system state and make multiple concurrent decisions by considering not only the current myopic utility but also the influence of subsequent users' decisions. We analyze the proposed Indian Buffet Game under two scenarios: customers request multiple dishes without a budget constraint and with a budget constraint. For both cases, we design recursive best-response algorithms to find the subgame perfect Nash equilibrium for customers and characterize special properties of the Nash equilibrium profile under the homogeneous setting. Moreover, we introduce a non-Bayesian social learning algorithm for customers to learn the system state, and theoretically prove its convergence. Finally, we conduct simulations to validate the effectiveness and efficiency of the proposed algorithms.
Social fragmentation caused by widening differences among constituents has recently become a highly relevant issue in modern society. Theoretical models of social fragmentation using the adaptive network framework have been proposed and studied in earlier literature; they are known to either converge to a homogeneous, well-connected network or fragment into many disconnected sub-networks with distinct states. Here we introduced diversity in the behavioral attributes of social constituents and studied its effects on social network evolution. We investigated, using a networked agent-based simulation model, how the resulting network states and topologies would be affected when individual constituents' cultural tolerance, cultural state change rate, and edge weight change rate were systematically diversified. The results showed that the diversity of cultural tolerance had the most direct effect in keeping cultural diversity within the society high while simultaneously reducing the average shortest path length of the social network, which was not reported in the earlier literature. Diversity in other behavioral attributes also affected the final states of the social network, with some nonlinear interactions. Our results suggest that having a broad distribution of cultural tolerance levels within society can help promote the coexistence of cultural diversity and structural connectivity.