Reinforcement Learning for Robotic Manipulation using Simulated Locomotion Demonstrations


Abstract in English

Learning robotic manipulation through reinforcement learning (RL) using only sparse reward signals is still considered a largely unsolved problem. Leveraging human demonstrations can make the learning process more sample efficient, but obtaining high-quality demonstrations can be costly or unfeasible. In this paper we propose a novel approach that introduces object-level demonstrations, i.e. examples of where the objects should be at any state. These demonstrations are generated automatically through RL hence require no expert knowledge. We observe that, during a manipulation task, an object is moved from an initial to a final position. When seen from the point of view of the object being manipulated, this induces a locomotion task that can be decoupled from the manipulation task and learnt through a physically-realistic simulator. The resulting object-level trajectories, called simulated locomotion demonstrations (SLDs), are then leveraged to define auxiliary rewards that are used to learn the manipulation policy. The proposed approach has been evaluated on 13 tasks of increasing complexity, and has been demonstrated to achieve higher success rate and faster learning rates compared to alternative algorithms. SLDs are especially beneficial for tasks like multi-object stacking and non-rigid object manipulation.

Download