ترغب بنشر مسار تعليمي؟ اضغط هنا

Interpretable Reinforcement Learning Inspired by Piagets Theory of Cognitive Development

52   0   0.0 ( 0 )
 نشر من قبل Peyman Setoodeh
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Endeavors for designing robots with human-level cognitive abilities have led to different categories of learning machines. According to Skinners theory, reinforcement learning (RL) plays a key role in human intuition and cognition. Majority of the state-of-the-art methods including deep RL algorithms are strongly influenced by the connectionist viewpoint. Such algorithms can significantly benefit from theories of mind and learning in other disciplines. This paper entertains the idea that theories such as language of thought hypothesis (LOTH), script theory, and Piagets cognitive development theory provide complementary approaches, which will enrich the RL field. Following this line of thinking, a general computational building block is proposed for Piagets schema theory that supports the notions of productivity, systematicity, and inferential coherence as described by Fodor in contrast with the connectionism theory. Abstraction in the proposed method is completely upon the system itself and is not externally constrained by any predefined architecture. The whole process matches the Neissers perceptual cycle model. Performed experiments on three typical control problems followed by behavioral analysis confirm the interpretability of the proposed method and its competitiveness compared to the state-of-the-art algorithms. Hence, the proposed framework can be viewed as a step towards achieving human-like cognition in artificial intelligent systems.



قيم البحث

اقرأ أيضاً

Machine learning technologies are expected to be great tools for scientific discoveries. In particular, materials development (which has brought a lot of innovation by finding new and better functional materials) is one of the most attractive scienti fic fields. To apply machine learning to actual materials development, collaboration between scientists and machine learning is becoming inevitable. However, such collaboration has been restricted so far due to black box machine learning, in which it is difficult for scientists to interpret the data-driven model from the viewpoint of material science and physics. Here, we show a material development success story that was achieved by good collaboration between scientists and one type of interpretable (explainable) machine learning called factorized asymptotic Bayesian inference hierarchical mixture of experts (FAB/HMEs). Based on material science and physics, we interpreted the data-driven model constructed by the FAB/HMEs, so that we discovered surprising correlation and knowledge about thermoelectric material. Guided by this, we carried out actual material synthesis that led to identification of a novel spin-driven thermoelectric material with the largest thermopower to date.
We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state, in order to plan and to generalize better out-of-distribution. The agents architecture uses a set representation and a b ottleneck mechanism, forcing the number of entities to which the agent attends at each planning step to be small. In experiments with customized MiniGrid environments with different dynamics, we observe that the design allows agents to learn to plan effectively, by attending to the relevant objects, leading to better out-of-distribution generalization.
It is a long-standing goal of artificial intelligence (AI) to be superior to human beings in decision making. Games are suitable for testing AI capabilities of making good decisions in non-numerical tasks. In this paper, we develop a new AI algorithm to play the penny-matching game considered in Shannons mind-reading machine (1953) against human players. In particular, we exploit cognitive hierarchy theory and Bayesian learning techniques to continually evolve a model for predicting human player decisions, and let the AI player make decisions according to the model predictions to pursue the best chance of winning. Experimental results show that our AI algorithm beats 27 out of 30 volunteer human players.
We present a reinforcement learning framework, called Programmatically Interpretable Reinforcement Learning (PIRL), that is designed to generate interpretable and verifiable agent policies. Unlike the popular Deep Reinforcement Learning (DRL) paradig m, which represents policies by neural networks, PIRL represents policies using a high-level, domain-specific programming language. Such programmatic policies have the benefits of being more easily interpreted than neural networks, and being amenable to verification by symbolic methods. We propose a new method, called Neurally Directed Program Search (NDPS), for solving the challenging nonsmooth optimization problem of finding a programmatic policy with maximal reward. NDPS works by first learning a neural policy network using DRL, and then performing a local search over programmatic policies that seeks to minimize a distance from this neural oracle. We evaluate NDPS on the task of learning to drive a simulated car in the TORCS car-racing environment. We demonstrate that NDPS is able to discover human-readable policies that pass some significant performance bars. We also show that PIRL policies can have smoother trajectories, and can be more easily transferred to environments not encountered during training, than corresponding policies discovered by DRL.
Reinforcement learning (RL) agents in human-computer interactions applications require repeated user interactions before they can perform well. To address this cold start problem, we propose a novel approach of using cognitive models to pre-train RL agents before they are applied to real users. After briefly reviewing relevant cognitive models, we present our general methodological approach, followed by two case studies from our previous and ongoing projects. We hope this position paper stimulates conversations between RL, HCI, and cognitive science researchers in order to explore the full potential of the approach.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا