Learning Quadruped Locomotion Policies with Reward Machines


Abstract in English

Legged robots have been shown to be effective in navigating unstructured environments. Although there has been much success in learning locomotion policies for quadruped robots, there is little research on how to incorporate human knowledge to facilitate this learning process. In this paper, we demonstrate that human knowledge in the form of LTL formulas can be applied to quadruped locomotion learning within a Reward Machine (RM) framework. Experimental results in simulation show that our RM-based approach enables easily defining diverse locomotion styles, and efficiently learning locomotion policies of the defined styles.

Download