ﻻ يوجد ملخص باللغة العربية
Model-free deep reinforcement learning is sample inefficient. One hypothesis -- speculated, but not confirmed -- is that catastrophic interference within an environment inhibits learning. We test this hypothesis through a large-scale empirical study in the Arcade Learning Environment (ALE) and, indeed, find supporting evidence. We show that interference causes performance to plateau; the network cannot train on segments beyond the plateau without degrading the policy used to reach there. By synthetically controlling for interference, we demonstrate performance boosts across architectures, learning algorithms and environments. A more refined analysis shows that learning one segment of a game often increases prediction errors elsewhere. Our study provides a clear empirical link between catastrophic interference and sample efficiency in reinforcement learning.
Motivated by vision-based reinforcement learning (RL) problems, in particular Atari games from the recent benchmark Aracade Learning Environment (ALE), we consider spatio-temporal prediction problems where future (image-)frames are dependent on contr
The powerful learning ability of deep neural networks enables reinforcement learning (RL) agents to learn competent control policies directly from high-dimensional and continuous environments. In theory, to achieve stable performance, neural networks
Intelligent agents need to generalize from past experience to achieve goals in complex environments. World models facilitate such generalization and allow learning behaviors from imagined outcomes to increase sample-efficiency. While learning world m
The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Neural networks are not, in general, capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of
Reproducibility in reinforcement learning is challenging: uncontrolled stochasticity from many sources, such as the learning algorithm, the learned policy, and the environment itself have led researchers to report the performance of learned agents us