ﻻ يوجد ملخص باللغة العربية
Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning paradigm which seeks to maximize return under entropy regularization. However, action entropy does not necessarily coincide with state entropy, e.g., when multiple actions produce the same transition. Instead, we propose to maximize the transition entropy, i.e., the entropy of next states. We show that transition entropy can be described by two terms; namely, model-dependent transition entropy and action redundancy. Particularly, we explore the latter in both deterministic and stochastic settings and develop tractable approximation methods in a near model-free setup. We construct algorithms to minimize action redundancy and demonstrate their effectiveness on a synthetic environment with multiple redundant actions as well as contemporary benchmarks in Atari and Mujoco. Our results suggest that action redundancy is a fundamental problem in reinforcement learning.
Discrete-continuous hybrid action space is a natural setting in many practical problems, such as robot control and game AI. However, most previous Reinforcement Learning (RL) works only demonstrate the success in controlling with either discrete or c
Deep Reinforcement Learning (DRL) methods have performed well in an increasing numbering of high-dimensional visual decision making domains. Among all such visual decision making problems, those with discrete action spaces often tend to have underlyi
In multi-agent reinforcement learning, the inherent non-stationarity of the environment caused by other agents actions posed significant difficulties for an agent to learn a good policy independently. One way to deal with non-stationarity is agent mo
Transfer learning (TL) is a promising way to improve the sample efficiency of reinforcement learning. However, how to efficiently transfer knowledge across tasks with different state-action spaces is investigated at an early stage. Most previous stud
Learning effective policies for sparse objectives is a key challenge in Deep Reinforcement Learning (RL). A common approach is to design task-related dense rewards to improve task learnability. While such rewards are easily interpreted, they rely on