Jelly Bean World: A Testbed for Never-Ending Learning

112 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Emmanouil Antonios Platanios

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Emmanouil Antonios Platanios - Abulhair Saparov - Tom Mitchell

التعلم الآلي الذكاء الاصطناعي أنظمة متعددة العملاء

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Machine learning has shown growing success in recent years. However, current machine learning systems are highly specialized, trained for particular problems or domains, and typically on a single narrow dataset. Human learning, on the other hand, is highly general and adaptable. Never-ending learning is a machine learning paradigm that aims to bridge this gap, with the goal of encouraging researchers to design machine learning systems that can learn to perform a wider variety of inter-related tasks in more complex environments. To date, there is no environment or testbed to facilitate the development and evaluation of never-ending learning systems. To this end, we propose the Jelly Bean World testbed. The Jelly Bean World allows experimentation over two-dimensional grid worlds which are filled with items and in which agents can navigate. This testbed provides environments that are sufficiently complex and where more generally intelligent algorithms ought to perform better than current state-of-the-art reinforcement learning approaches. It does so by producing non-stationary environments and facilitating experimentation with multi-task, multi-agent, multi-modal, and curriculum learning settings. We hope that this new freely-available software will prompt new research and interest in the development and evaluation of never-ending learning systems and more broadly, general intelligence systems.

قيم البحث

اقرأ أيضاً

NELS - Never-Ending Learner of Sounds

87 - Benjamin Elizalde , Rohan Badlani , Ankit Shah 2018

Sounds are essential to how humans perceive and interact with the world and are captured in recordings and shared on the Internet on a minute-by-minute basis. These recordings, which are predominantly videos, constitute the largest archive of sounds we know. However, most of these recordings have undescribed content making necessary methods for automatic sound analysis, indexing and retrieval. These methods have to address multiple challenges, such as the relation between sounds and language, numerous and diverse sound classes, and large-scale evaluation. We propose a system that continuously learns from the web relations between sounds and language, improves sound recognition models over time and evaluates its learning competency in the large-scale without references. We introduce the Never-Ending Learner of Sounds (NELS), a project for continuously learning of sounds and their associated knowledge, available on line in nels.cs.cmu.edu

أنظمة الصوت في الحاسوب معالجة الصوت والكلام

The never ending search for high temperature superconductivity

108 - T.H. Geballe 2006

A brief history of the discovery of new superconductors is given. Different types of pairing mechanisms are considered. By comparing Tcs in different cuprate families it is concluded that the pairing in the CuO2 layers must be supplemented by interac tions elsewhere in the unit cell. This conclusion is reached simply by considering the significant variations in Tc that are found in structures that have the same sequence of CuO2 layers within the unit cell but have different intervening layers. A quasi-particle is postulated to account for pairing found in the double chain layer of the Pr247 cuprate and may also exist in the CuO2 layers of all the cuprates.

المنصة الفائقة علم المواد

Continuous Coordination As a Realistic Scenario for Lifelong Learning

79 - Hadi Nekoei , Akilesh Badrinaaraayanan , Aaron Courville 2021

Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. Lifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and u sing knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of LLL algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents policies change over time. In this work, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi -- a partially-observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods, and benchmark state-of-the-art LLL algorithms in limited memory and computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides us with a pragmatic way of going beyond centralized training which is the most commonly used training protocol in MARL. We empirically show that the agents trained in our setup are able to coordinate well with unseen agents, without any additional assumptions made by previous works. The code and all pre-trained models are available at https://github.com/chandar-lab/Lifelong-Hanabi.

التعلم الآلي الذكاء الاصطناعي أنظمة متعددة العملاء

Reinforcement Learning for Intelligent Healthcare Systems: A Comprehensive Survey

190 - Alaa Awad Abdellatif , Naram Mhaisen , Zina Chkirbene 2021

The rapid increase in the percentage of chronic disease patients along with the recent pandemic pose immediate threats on healthcare expenditure and elevate causes of death. This calls for transforming healthcare systems away from one-on-one patient treatment into intelligent health systems, to improve services, access and scalability, while reducing costs. Reinforcement Learning (RL) has witnessed an intrinsic breakthrough in solving a variety of complex problems for diverse applications and services. Thus, we conduct in this paper a comprehensive survey of the recent models and techniques of RL that have been developed/used for supporting Intelligent-healthcare (I-health) systems. This paper can guide the readers to deeply understand the state-of-the-art regarding the use of RL in the context of I-health. Specifically, we first present an overview for the I-health systems challenges, architecture, and how RL can benefit these systems. We then review the background and mathematical modeling of different RL, Deep RL (DRL), and multi-agent RL models. After that, we provide a deep literature review for the applications of RL in I-health systems. In particular, three main areas have been tackled, i.e., edge intelligence, smart core network, and dynamic treatment regimes. Finally, we highlight emerging challenges and outline future research directions in driving the future success of RL in I-health systems, which opens the door for exploring some interesting and unsolved problems.

التعلم الآلي الذكاء الاصطناعي أنظمة متعددة العملاء

Structured World Belief for Reinforcement Learning in POMDP

121 - Gautam Singh , Skand Peri , Junghyun Kim 2021

Object-centric world models provide structured representation of the scene and can be an important backbone in reinforcement learning and planning. However, existing approaches suffer in partially-observable environments due to the lack of belief sta tes. In this paper, we propose Structured World Belief, a model for learning and inference of object-centric belief states. Inferred by Sequential Monte Carlo (SMC), our belief states provide multiple object-centric scene hypotheses. To synergize the benefits of SMC particles with object representations, we also propose a new object-centric dynamics model that considers the inductive bias of object permanence. This enables tracking of object states even when they are invisible for a long time. To further facilitate object tracking in this regime, we allow our model to attend flexibly to any spatial location in the image which was restricted in previous models. In experiments, we show that object-centric belief provides a more accurate and robust performance for filtering and generation. Furthermore, we show the efficacy of structured world belief in improving the performance of reinforcement learning, planning and supervised reasoning.

التعلم الآلي الذكاء الاصطناعي