It has been well demonstrated that inverse reinforcement learning (IRL) is an effective technique for teaching machines to perform tasks at human skill levels given human demonstrations (i.e., human-to-machine apprenticeship learning). This paper seeks to show that a similar approach works with human learners: given demonstrations from human experts, inverse reinforcement learning techniques can be used to teach other humans to perform at higher skill levels (i.e., human-to-human apprenticeship learning). To show this, two experiments were conducted using a simple, real-time web game in which players were asked to touch targets in order to earn as many points as possible. For these experiments, player performance was defined as the number of targets a player touched, irrespective of the points the player actually earned. This allowed in-game points to be modified and the effect of these alterations on performance to be measured. At no time were participants told the true performance metric. To determine the point modifications, IRL was applied to demonstrations of human experts playing the game. The results show with significance that performance improved over the control for select treatment groups. Finally, in addition to the experiments, we detail the algorithmic challenges we faced when conducting them and the techniques we used to overcome those challenges.
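The abstract does not say which IRL algorithm was used to derive the point modifications, so the following is only a minimal sketch of one standard approach: apprenticeship learning by feature-expectation matching (the projection method of Abbeel and Ng, 2004). The toy MDP, the state features, and the stand-in "expert" demonstrations below are all hypothetical; in the actual experiment the expert feature expectations would be estimated from recorded human gameplay, and the recovered reward weights would drive the in-game point scheme.

```python
# Hypothetical sketch: apprenticeship learning via IRL (projection method,
# Abbeel & Ng 2004) on a toy tabular MDP.  All names and numbers here are
# illustrative assumptions, not the paper's actual setup.
import numpy as np

rng = np.random.default_rng(0)
GAMMA = 0.9

# Toy MDP: S states, A actions, transition kernel P[a, s, s'] and binary
# state features Phi[s, k] (e.g., "target touched", "near a target").
S, A, K = 12, 3, 4
P = rng.dirichlet(np.ones(S), size=(A, S))            # shape (A, S, S)
Phi = rng.integers(0, 2, size=(S, K)).astype(float)   # shape (S, K)

def value_iteration(reward, iters=300):
    """Greedy policy for a state-based reward vector of shape (S,)."""
    V = np.zeros(S)
    for _ in range(iters):
        Q = reward[None, :] + GAMMA * (P @ V)          # shape (A, S)
        V = Q.max(axis=0)
    return Q.argmax(axis=0)                            # policy, shape (S,)

def feature_expectations(policy, episodes=200, horizon=60):
    """Monte-Carlo estimate of discounted feature expectations mu(pi)."""
    mu = np.zeros(K)
    for _ in range(episodes):
        s = rng.integers(S)
        for t in range(horizon):
            mu += (GAMMA ** t) * Phi[s]
            s = rng.choice(S, p=P[policy[s], s])
    return mu / episodes

# Stand-in for expert demonstrations: feature expectations of a policy that
# is optimal for a hidden reward, just to make the script run end to end.
hidden_w = rng.normal(size=K)
mu_expert = feature_expectations(value_iteration(Phi @ hidden_w))

# Projection algorithm: find reward weights w whose optimal policy matches
# the expert's feature expectations.
policy = rng.integers(A, size=S)                       # arbitrary start
mu_bar = feature_expectations(policy)
for _ in range(15):
    w = mu_expert - mu_bar                             # current weight guess
    if np.linalg.norm(w) < 1e-2:
        break
    policy = value_iteration(Phi @ w)                  # RL step under w
    mu = feature_expectations(policy)
    d = mu - mu_bar
    mu_bar = mu_bar + (d @ (mu_expert - mu_bar)) / (d @ d) * d  # projection

print("recovered reward weights:", np.round(w, 3))
```

In this reading, the linear reward `Phi @ w` recovered from expert play would be translated into the modified point values shown to the treatment groups, while performance itself continues to be scored as targets touched.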