Reinforcement learning problems are often described through rewards that indicate whether an agent has completed some task. This specification can yield desirable behavior; however, many problems are difficult to specify in this manner, since one often needs to know the proper configuration of the agent. When humans learn to solve tasks, we often learn from visual instructions composed of images or videos. Such representations motivate our development of Perceptual Reward Functions, which provide a mechanism for creating visual task descriptions. We show that this approach allows an agent to learn from rewards based on raw pixels rather than internal parameters.
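As a rough illustration of the idea (a minimal sketch, not the paper's implementation), a perceptual reward can score how closely the agent's current camera view matches a goal image; all names below are hypothetical:

```python
import numpy as np

def perceptual_reward(observation: np.ndarray, goal_image: np.ndarray) -> float:
    """Negative mean per-pixel distance between the current view and the goal image."""
    obs = observation.astype(np.float32) / 255.0
    goal = goal_image.astype(np.float32) / 255.0
    return -float(np.mean(np.abs(obs - goal)))

# The agent is rewarded for making its camera view match the goal image.
obs = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
goal = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
print(perceptual_reward(obs, goal))  # approaches 0.0 as the views match
```

In practice one would likely compare learned image embeddings rather than raw pixel values, but the sketch shows the core mechanism: the reward is computed entirely from pixels, with no access to the environment's internal parameters.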
In reinforcement learning, we often define goals by specifying rewards within desirable states. One problem with this approach is that we typically need to redefine the rewards each time the goal changes, which often requires some understanding of …
Autonomous agents optimize the reward function we give them. What they don't know is how hard it is for us to design a reward function that actually captures what we want. When designing the reward, we might think of some specific training scenarios, …
The design of reward functions in reinforcement learning is a human skill that comes with experience. Unfortunately, there is no methodology in the literature that could guide a human to design the reward function or to allow a human to transfer …
In many real-world tasks, it is not possible to procedurally specify an RL agent's reward function. In such cases, a reward function must instead be learned from interacting with and observing humans. However, current techniques for reward learning …
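One common way to learn a reward from human observation, shown here only as a generic sketch (a Bradley-Terry preference model over trajectory returns, not necessarily the technique of the abstract above; all names and the simulated labels are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                        # feature dimension per state
w_true = rng.normal(size=d)  # hidden "human" preference (simulation only)
w = np.zeros(d)              # learned linear reward parameters

# Fit w by gradient ascent on the Bradley-Terry log-likelihood of
# simulated pairwise preferences over short random trajectories.
for _ in range(2000):
    traj_a = rng.normal(size=(10, d))          # (T, d) per-step features
    traj_b = rng.normal(size=(10, d))
    fa, fb = traj_a.sum(axis=0), traj_b.sum(axis=0)
    prefer_a = (fa - fb) @ w_true > 0          # stand-in for a human label
    diff = (fa - fb) if prefer_a else (fb - fa)
    p = 1.0 / (1.0 + np.exp(-(diff @ w)))      # P(preferred trajectory wins)
    w += 0.01 * (1.0 - p) * diff               # gradient of log-likelihood

print(np.corrcoef(w, w_true)[0, 1])  # near 1: learned reward aligns with the true one
```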
We present a novel method for learning a set of disentangled reward functions that sum to the original environment reward and are constrained to be independently obtainable. We define independent obtainability in terms of value functions with respect to …
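The sum constraint in this abstract can be satisfied by construction. The sketch below shows one way, assuming tabular states and hypothetical names; the independence constraint, which the abstract defines through value functions, is omitted here and would enter as an additional training objective:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, k = 10, 3
logits = rng.normal(size=(n_states, k))  # stand-in for learnable parameters

def decompose(env_reward: np.ndarray) -> np.ndarray:
    """Split each state's reward into k parts that sum to the original reward."""
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over the k channels
    return env_reward[:, None] * weights           # shape (n_states, k)

env_reward = rng.normal(size=n_states)
parts = decompose(env_reward)
assert np.allclose(parts.sum(axis=1), env_reward)  # sum constraint holds by construction
```

Because the channel weights form a convex combination at every state, the learned components always sum to the environment reward, so training can focus entirely on shaping the decomposition toward independent obtainability.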