Random Projection in Neural Episodic Control

103 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Daichi Nishio

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Daichi Nishio - Satoshi Yamane

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

End-to-end deep reinforcement learning has enabled agents to learn with little preprocessing by humans. However, it is still difficult to learn stably and efficiently because the learning method usually uses a nonlinear function approximation. Neural Episodic Control (NEC), which has been proposed in order to improve sample efficiency, is able to learn stably by estimating action values using a non-parametric method. In this paper, we propose an architecture that incorporates random projection into NEC to train with more stability. In addition, we verify the effectiveness of our architecture by Ataris five games. The main idea is to reduce the number of parameters that have to learn by replacing neural networks with random projection in order to reduce dimensions while keeping the learning end-to-end.

قيم البحث

132 - Daichi Nishio , Satoshi Yamane 2018

The research on deep reinforcement learning which estimates Q-value by deep learning has been attracted the interest of researchers recently. In deep reinforcement learning, it is important to efficiently learn the experiences that an agent has colle cted by exploring environment. We propose NEC2DQN that improves learning speed of a poor sample efficiency algorithm such as DQN by using good one such as NEC at the beginning of learning. We show it is able to learn faster than Double DQN or N-step DQN in the experiments of Pong.

التعلم الآلي الذكاء الاصطناعي

Episodic Memory Deep Q-Networks

130 - Zichuan Lin , Tianqi Zhao , Guangwen Yang 2018

Reinforcement learning (RL) algorithms have made huge progress in recent years by leveraging the power of deep neural networks (DNN). Despite the success, deep RL algorithms are known to be sample inefficient, often requiring many rounds of interacti on with the environments to obtain satisfactory performance. Recently, episodic memory based RL has attracted attention due to its ability to latch on good actions quickly. In this paper, we present a simple yet effective biologically inspired RL algorithm called Episodic Memory Deep Q-Networks (EMDQN), which leverages episodic memory to supervise an agent during training. Experiments show that our proposed method can lead to better sample efficiency and is more likely to find good policies. It only requires 1/5 of the interactions of DQN to achieve many state-of-the-art performances on Atari games, significantly outperforming regular DQN and other episodic memory based RL algorithms.

التعلم الآلي الذكاء الاصطناعي التعلم الالي

Sample-Efficient Reinforcement Learning with Maximum Entropy Mellowmax Episodic Control

402 - Marta Sarrico , Kai Arulkumaran , Andrea Agostinelli 2019

Deep networks have enabled reinforcement learning to scale to more complex and challenging domains, but these methods typically require large quantities of training data. An alternative is to use sample-efficient episodic control methods: neuro-inspi red algorithms which use non-/semi-parametric models that predict values based on storing and retrieving previously experienced transitions. One way to further improve the sample efficiency of these approaches is to use more principled exploration strategies. In this work, we therefore propose maximum entropy mellowmax episodic control (MEMEC), which samples actions according to a Boltzmann policy with a state-dependent temperature. We demonstrate that MEMEC outperforms other uncertainty- and softmax-based exploration methods on classic reinforcement learning environments and Atari games, achieving both more rapid learning and higher final rewards.

التعلم الآلي الحوسبة العصبية والتطورية التعلم الالي

Memory-Efficient Episodic Control Reinforcement Learning with Dynamic Online k-means

89 - Andrea Agostinelli , Kai Arulkumaran , Marta Sarrico 2019

Recently, neuro-inspired episodic control (EC) methods have been developed to overcome the data-inefficiency of standard deep reinforcement learning approaches. Using non-/semi-parametric models to estimate the value function, they learn rapidly, ret rieving cached values from similar past states. In realistic scenarios, with limited resources and noisy data, maintaining meaningful representations in memory is essential to speed up the learning and avoid catastrophic forgetting. Unfortunately, EC methods have a large space and time complexity. We investigate different solutions to these problems based on prioritising and ranking stored states, as well as online clustering techniques. We also propose a new dynamic online k-means algorithm that is both computationally-efficient and yields significantly better performance at smaller memory sizes; we validate this approach on classic reinforcement learning environments and Atari games.

التعلم الآلي الحوسبة العصبية والتطورية التعلم الالي

Deep Random Projection Outlyingness for Unsupervised Anomaly Detection

119 - Martin Bauw , Santiago Velasco-Forero , Jesus Angulo 2021

Random projection is a common technique for designing algorithms in a variety of areas, including information retrieval, compressive sensing and measuring of outlyingness. In this work, the original random projection outlyingness measure is modified and associated with a neural network to obtain an unsupervised anomaly detection method able to handle multimodal normality. Theoretical and experimental arguments are presented to justify the choice of the anomaly score estimator. The performance of the proposed neural network approach is comparable to a state-of-the-art anomaly detection method. Experiments conducted on the MNIST, Fashion-MNIST and CIFAR-10 datasets show the relevance of the proposed approach.

معالجة الإشارات الذكاء الاصطناعي التعلم الالي