
Other-Play for Zero-Shot Coordination

Published by Hengyuan Hu
Publication date: 2020
Research field: Informatics Engineering
Research language: English





We consider the problem of zero-shot coordination: constructing AI agents that can coordinate with novel partners they have not seen before (e.g. humans). Standard Multi-Agent Reinforcement Learning (MARL) methods typically focus on the self-play (SP) setting, where agents construct strategies by playing the game with themselves repeatedly. Unfortunately, applying SP naively to the zero-shot coordination problem can produce agents that establish highly specialized conventions that do not carry over to novel partners they have not been trained with. We introduce a novel learning algorithm called other-play (OP) that enhances self-play by looking for more robust strategies, exploiting the presence of known symmetries in the underlying problem. We characterize OP theoretically as well as experimentally. We study the cooperative card game Hanabi and show that OP agents achieve higher scores when paired with independently trained agents. In preliminary results we also show that our OP agents obtain higher average scores when paired with human players, compared to state-of-the-art SP agents.
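The contrast between self-play and a symmetry-aware objective can be illustrated with a small sketch. Everything in it is an assumption made for illustration and does not come from the abstract: the toy lever game, its payoffs, and the function names `self_play_value` and `other_play_value` are hypothetical. The sketch scores each action by its expected return when the partner's choice is relabeled by a uniformly random payoff-preserving symmetry, which is one simple way to read "exploiting known symmetries".

```python
from itertools import permutations

# Hypothetical toy game (an assumption, not from the abstract): both players
# pick a lever and are rewarded only if they pick the same one.
PAYOFFS = [1.0, 1.0, 1.0, 0.9]


def payoff(a, b):
    """Joint reward: the lever's payoff on a match, zero otherwise."""
    return PAYOFFS[a] if a == b else 0.0


def symmetries():
    """All relabelings of the levers that leave every payoff unchanged."""
    n = len(PAYOFFS)
    return [p for p in permutations(range(n))
            if all(PAYOFFS[p[i]] == PAYOFFS[i] for i in range(n))]


def self_play_value(a):
    """Return when the partner is an exact copy and also picks lever a."""
    return payoff(a, a)


def other_play_value(a):
    """Expected return when the partner's choice is relabeled by a random symmetry."""
    syms = symmetries()
    return sum(payoff(a, phi[a]) for phi in syms) / len(syms)


levers = range(len(PAYOFFS))
sp_choice = max(levers, key=self_play_value)
op_choice = max(levers, key=other_play_value)
print("self-play picks lever", sp_choice)
print("symmetry-aware objective picks lever", op_choice)
```

In this sketch, self-play settles on an arbitrary one of the tied 1.0 levers, which an independently trained partner has no reason to match, whereas the symmetry-aware objective prefers the single lever that cannot be confused with any other.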




Read also

Yun Li, Zhe Liu, Lina Yao (2021)
Zero-shot learning (ZSL) aims to transfer knowledge from seen classes to semantically related unseen classes, which are absent during training. The promising strategies for ZSL are to synthesize visual features of unseen classes conditioned on semantic side information and to incorporate meta-learning to eliminate the model's inherent bias towards seen classes. While existing meta generative approaches pursue a common model shared across task distributions, we aim to construct a generative network adaptive to task characteristics. To this end, we propose an Attribute-Modulated generAtive meta-model for Zero-shot learning (AMAZ). Our model consists of an attribute-aware modulation network, an attribute-augmented generative network, and an attribute-weighted classifier. Given unseen classes, the modulation network adaptively modulates the generator by applying task-specific transformations so that the generative network can adapt to highly diverse tasks. The weighted classifier utilizes the data quality to enhance the training procedure, further improving the model performance. Our empirical evaluations on four widely-used benchmarks show that AMAZ outperforms state-of-the-art methods by 3.8% and 3.1% in ZSL and generalized ZSL settings, respectively, demonstrating the superiority of our method. Our experiments on a zero-shot image retrieval task show AMAZ's ability to synthesize instances that portray real visual characteristics.
External knowledge (a.k.a. side information) plays a critical role in zero-shot learning (ZSL), which aims to make predictions for unseen classes that have never appeared in training data. Several kinds of external knowledge, such as text and attributes, have been widely investigated, but on their own they are limited by incomplete semantics. Therefore, some very recent studies propose to use Knowledge Graphs (KGs) due to their high expressivity and compatibility for representing many kinds of knowledge. However, the ZSL community is still short of standard benchmarks for studying and comparing different KG-based ZSL methods. In this paper, we propose 5 resources for KG-based research in zero-shot image classification (ZS-IMGC) and zero-shot KG completion (ZS-KGC). For each resource, we contribute a benchmark and its KG with semantics ranging from text to attributes, from relational knowledge to logical expressions. We clearly present how the resources are constructed, their statistics and formats, and how they can be utilized, with cases evaluating ZSL methods' performance and explanations. Our resources are available at https://github.com/China-UK-ZSL/Resources_for_KZSL.
The ability to automatically extract Knowledge Graphs (KG) from a given collection of documents is a long-standing problem in Artificial Intelligence. One way to assess this capability is through the task of slot filling. Given an entity query in the form of [Entity, Slot, ?], a system is asked to 'fill' the slot by generating or extracting the missing value from a relevant passage or passages. This capability is crucial for creating systems for automatic knowledge base population, which is in ever-increasing demand, especially in enterprise applications. Recently, there has been a promising direction in evaluating language models in the same way we would evaluate knowledge bases, and the task of slot filling is the most suitable for this purpose. Recent advancements in the field try to solve this task in an end-to-end fashion using retrieval-based language models. Models like Retrieval Augmented Generation (RAG) show surprisingly good performance without involving complex information extraction pipelines. However, the results achieved by these models on the two slot filling tasks in the KILT benchmark are still not at the level required by real-world information extraction systems. In this paper, we describe several strategies we adopted to improve the retriever and the generator of RAG in order to make it a better slot filler. Our KGI0 system (available at https://github.com/IBM/retrieve-write-slot-filling) reached the top-1 position on the KILT leaderboard on both the T-REx and zsRE datasets by a large margin.
Incorporating external knowledge into Visual Question Answering (VQA) has become a vital practical need. Existing methods mostly adopt pipeline approaches with different components for knowledge matching and extraction, feature learning, etc. However, such pipeline approaches suffer when some component does not perform well, which leads to error propagation and poor overall performance. Furthermore, the majority of existing approaches ignore the answer bias issue: many answers may never have appeared during training (i.e., unseen answers) in real-world applications. To bridge these gaps, in this paper, we propose a Zero-shot VQA algorithm using knowledge graphs and a mask-based learning mechanism for better incorporating external knowledge, and present new answer-based Zero-shot VQA splits for the F-VQA dataset. Experiments show that our method can achieve state-of-the-art performance in Zero-shot VQA with unseen answers, while dramatically augmenting existing end-to-end models on the normal F-VQA task.
Node classification is a central task in graph data analysis. Scarce or even no labeled data for emerging classes is a big challenge for existing methods. A natural question arises: can we classify the nodes from those classes that have never been seen? In this paper, we study this zero-shot node classification (ZNC) problem, which has a two-stage nature: (1) acquiring high-quality class semantic descriptions (CSDs) for knowledge transfer, and (2) designing a well-generalized graph-based learning model. For the first stage, we give a novel quantitative CSDs evaluation strategy based on estimating the real class relationships, so as to get the best CSDs in a completely automatic way. For the second stage, we propose a novel Decomposed Graph Prototype Network (DGPN) method, following the principles of locality and compositionality for zero-shot model generalization. Finally, we conduct extensive experiments to demonstrate the effectiveness of our solutions.
