Multiagent Systems (MAS) research has reached a level of maturity at which it can be confidently applied to complex real-life problems. The successful application of MAS methods to behavior modeling, strategic reasoning, and decentralized governance has encouraged us to focus on the applicability of MAS techniques to a class of industrial systems and to elaborate on the potential of, and challenges for, method integration and contextualization. We direct attention to a form of industrial practice called Industrial Symbiosis Systems (ISS), a highly dynamic application domain for MAS techniques. In ISS, firms aim to reduce their material and energy footprint by circulating reusable resources among the members. To enable systematic reasoning about ISS behavior and to support the decisions of firms (as well as ISS designers), we see an opportunity for marrying industrial engineering with the engineering of multiagent systems. This enables introducing (1) representation frameworks to reason about the dynamics of ISS, (2) operational semantics to develop computational models of ISS, and (3) coordination mechanisms to enforce desirable ISS behaviors. We argue for the applicability and expressiveness of resource-bounded formalisms and norm-aware mechanisms for the design and deployment of ISS practices. In this proposal, we elaborate on different dimensions of ISS, present a methodological foundation for ISS development, and finally discuss open problems.
We present a formal multiagent framework for coordinating a class of collaborative industrial practices called Industrial Symbiotic Networks (ISNs) as cooperative games. The game-theoretic formulation of ISNs enables systematic reasoning about what we call the ISN implementation problem. Specifically, the characteristics of ISNs may render standard fair and stable benefit allocation methods inapplicable. Inspired by realistic ISN scenarios and following the literature on normative multiagent systems, we consider regulations and normative socio-economic policies as coordination instruments that, in combination with ISN games, resolve this situation. In this multiagent system, employing Marginal Contribution Nets (MC-Nets) as rule-based cooperative game representations fosters the combination of regulations and ISN games with no loss of expressiveness. We develop algorithmic methods for generating regulations that ensure the implementability of ISNs and, as policy support, present the policy requirements that guarantee the implementability of all desired ISNs in a balanced-budget way.
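As background on the MC-Net representation the abstract refers to: an MC-Net describes a cooperative game compactly as a set of rules, each mapping a pattern over players (required and forbidden members) to a value, and a coalition's worth is the sum of the values of the rules it satisfies. The sketch below illustrates this standard valuation scheme with hypothetical firm names and numbers; it is not the paper's implementation or its regulation-generation method.

```python
def coalition_value(coalition, rules):
    """Sum the values of all MC-Net rules the coalition satisfies.

    Each rule is (required, excluded, value): the rule fires when every
    required player is in the coalition and no excluded player is.
    """
    total = 0
    for required, excluded, value in rules:
        if required <= coalition and not (excluded & coalition):
            total += value
    return total

# Hypothetical rules: firms A and B exchanging a reusable resource create
# value 6; A gains an extra 2 only when firm C is not in the coalition.
rules = [
    (frozenset({"A", "B"}), frozenset(), 6),
    (frozenset({"A"}), frozenset({"C"}), 2),
]

print(coalition_value({"A", "B"}, rules))       # -> 8
print(coalition_value({"A", "B", "C"}, rules))  # -> 6
```

A regulation (e.g., a subsidy or tax) can be encoded in the same rule language, which is one reason rule-based representations combine cleanly with incentive instruments.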
This paper discusses the dynamics of Transaction Costs (TC) in Industrial Symbiosis Institutions (ISI) and provides a fair and stable mechanism for TC allocation among the firms involved in a given ISI. In principle, industrial symbiosis, as an implementation of the circular-economy paradigm in the context of industrial relations, is a practice aimed at reducing the material/energy footprint of the firm. A well-engineered form of this practice has been shown to decrease transaction costs at the collective level. This can be achieved using information systems for identifying potential synergies, evaluating mutually beneficial ones, implementing the contracts, and governing the behavior of the established relations. The question, then, is how to distribute the costs of maintaining such an information system in a fair and stable manner. We see such a cost as a collective transaction cost and employ an integrated method rooted in cooperative game theory and multiagent systems research to develop a fair and stable allocation mechanism for it. The novelty is twofold: developing analytical multiagent methods for capturing the dynamics of transaction costs in industrial symbiosis, and presenting a novel game-theoretic mechanism for allocating them in industrial symbiosis institutions. While the former contributes to the theory of industrial symbiosis (a methodological contribution), the latter supports decision makers aiming to specify fair and stable industrial symbiosis contracts (a practical contribution).
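To make the notion of a "fair" allocation in a cooperative cost game concrete, the sketch below computes the Shapley value, the standard fairness concept in cooperative game theory: each firm is charged its average marginal cost over all orders in which firms could join. This is general background, not the paper's specific TC-allocation mechanism, and the cost function and firm names are hypothetical.

```python
from itertools import permutations

def shapley(players, cost):
    """Shapley value: average marginal cost contribution over all join orders."""
    shares = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for p in order:
            before = cost(frozenset(coalition))
            coalition.add(p)
            shares[p] += cost(frozenset(coalition)) - before
    return {p: s / len(orders) for p, s in shares.items()}

# Hypothetical collective transaction cost: the shared information system
# costs a fixed 10 for any non-empty group, plus 1 per participating firm.
def tc(coalition):
    return 0 if not coalition else 10 + len(coalition)

print(shapley(["A", "B", "C"], tc))
```

Here each of the three symmetric firms is charged 13/3, and the shares sum exactly to the total cost tc({A, B, C}) = 13 (the efficiency property). Enumerating all permutations is exponential, so this direct method only suits small games.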
Modeling agent behavior is central to understanding the emergence of complex phenomena in multiagent systems. Prior work in agent modeling has largely been task-specific and driven by hand-engineered, domain-specific prior knowledge. We propose a general learning framework for modeling agent behavior in any multiagent system using only a small amount of interaction data. Our framework casts agent modeling as a representation learning problem. Consequently, we construct a novel objective inspired by imitation learning and agent identification, and design an algorithm for unsupervised learning of representations of agent policies. We demonstrate empirically the utility of the proposed framework in (i) a challenging high-dimensional competitive environment for continuous control and (ii) a cooperative environment for communication, on supervised predictive tasks, unsupervised clustering, and policy optimization using deep reinforcement learning.
Transfer learning has shown great potential to enhance single-agent Reinforcement Learning (RL) efficiency. Similarly, Multiagent RL (MARL) can be accelerated if agents share knowledge with each other. However, how an agent should learn from other agents remains an open problem. In this paper, we propose a novel Multiagent Option-based Policy Transfer (MAOPT) framework to improve MARL efficiency. MAOPT learns what advice to provide to each agent, and when to terminate it, by modeling multiagent policy transfer as an option learning problem. Our framework provides two kinds of option learning methods, distinguished by the experience used during training. One is the global option advisor, which uses global experience for its updates. The other is the local option advisor, which uses each agent's local experience when only local experience is available due to partial observability. In this setting, however, the agents' experiences may be inconsistent with one another, which can cause inaccuracy and oscillation in the option-value estimates. We therefore propose successor representation option learning, which addresses this by decoupling the environment dynamics from rewards and learning the option-value under each agent's preference. MAOPT can be easily combined with existing deep RL and MARL approaches, and experimental results show that it significantly boosts the performance of existing methods in both discrete and continuous state spaces.
Collective human knowledge has clearly benefited from the fact that innovations by individuals are taught to others through communication. Similar to human social groups, agents in distributed learning systems would likely benefit from communication to share knowledge and teach skills. The problem of teaching to improve agent learning has been investigated in prior work, but these approaches make assumptions that prevent the application of teaching to general multiagent problems, or require domain expertise for the problems to which they do apply. This learning-to-teach problem has inherent complexities related to measuring the long-term impact of teaching, which compound the standard multiagent coordination challenges. In contrast to existing works, this paper presents the first general framework and algorithm for intelligent agents to learn to teach in a multiagent environment. Our algorithm, Learning to Coordinate and Teach Reinforcement (LeCTR), addresses peer-to-peer teaching in cooperative multiagent reinforcement learning. Each agent in our approach learns both when and what to advise, then uses the received advice to improve local learning. Importantly, these roles are not fixed: the agents learn to assume the role of student and/or teacher at the appropriate moments, requesting and providing advice in order to improve teamwide performance and learning. Empirical comparisons against state-of-the-art teaching methods show that our teaching agents not only learn significantly faster, but also learn to coordinate in tasks where existing methods fail.
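To illustrate the kind of peer-to-peer advising loop such frameworks operate in: a student requests advice when its own value estimates are uncertain, and a teacher advises when it is confident. The sketch below uses simple hand-written Q-value-gap heuristics as stand-ins for the ask/tell decisions; in learning-to-teach approaches like LeCTR these decisions are themselves learned, so this is a deliberately simplified illustration with hypothetical numbers.

```python
def q_gap(q_values):
    """Confidence proxy: gap between the best and second-best Q-value."""
    top = sorted(q_values, reverse=True)
    return top[0] - top[1]

def act(student_q, teacher_q, ask_threshold=0.1, tell_threshold=0.5):
    """Student follows teacher advice only when the student is uncertain
    (small Q-gap) and the teacher is confident (large Q-gap)."""
    if q_gap(student_q) < ask_threshold and q_gap(teacher_q) > tell_threshold:
        return max(range(len(teacher_q)), key=lambda a: teacher_q[a])
    return max(range(len(student_q)), key=lambda a: student_q[a])

# An uncertain student (near-tied Q-values) defers to a confident teacher.
print(act([0.50, 0.48, 0.49], [0.1, 0.9, 0.2]))  # -> 1 (teacher's choice)
# A confident student acts on its own estimates.
print(act([0.90, 0.10, 0.10], [0.1, 0.9, 0.2]))  # -> 0
```

The hard part the abstract highlights, measuring the long-term effect of each piece of advice on teamwide learning, is exactly what fixed heuristics like these cannot capture and what the learned teaching policies address.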