ﻻ يوجد ملخص باللغة العربية
Current deep reinforcement learning (RL) approaches incorporate minimal prior knowledge about the environment, limiting computational and sample efficiency. textit{Objects} provide a succinct and causal description of the world, and many recent works have proposed unsupervised object representation learning using priors and losses over static object properties like visual consistency. However, object dynamics and interactions are also critical cues for objectness. In this paper we propose a framework for reasoning about object dynamics and behavior to rapidly determine minimal and task-specific object representations. To demonstrate the need to reason over object behavior and dynamics, we introduce a suite of RGBD MuJoCo object collection and avoidance tasks that, while intuitive and visually simple, confound state-of-the-art unsupervised object representation learning algorithms. We also highlight the potential of this framework on several Atari games, using our object representation and standard RL and planning algorithms to learn dramatically faster than existing deep RL algorithms.
Although deep reinforcement learning has advanced significantly over the past several years, sample efficiency remains a major challenge. Careful choice of input representations can help improve efficiency depending on the structure present in the pr
While neural sequence learning methods have made significant progress in single-document summarization (SDS), they produce unsatisfactory results on multi-document summarization (MDS). We observe two major challenges when adapting SDS advances to MDS
Reinforcement Learning (RL) has made remarkable achievements, but it still suffers from inadequate exploration strategies, sparse reward signals, and deceptive reward functions. These problems motivate the need for a more efficient and directed explo
Demonstration-guided reinforcement learning (RL) is a promising approach for learning complex behaviors by leveraging both reward feedback and a set of target task demonstrations. Prior approaches for demonstration-guided RL treat every new task as a
In this paper, we present a new class of Markov decision processes (MDPs), called Tsallis MDPs, with Tsallis entropy maximization, which generalizes existing maximum entropy reinforcement learning (RL). A Tsallis MDP provides a unified framework for