Optimal Stroke Learning with Policy Gradient Approach for Robotic Table Tennis


Abstract in English

Learning to play table tennis is a challenging task for robots because of the variety of strokes required. Recent advances in deep Reinforcement Learning (RL) have shown potential for learning the optimal strokes. However, the large amount of exploration required still limits the applicability of RL in real scenarios. In this paper, we first propose a realistic simulation environment in which several models are built for the ball's dynamics and the robot's kinematics. Instead of training an end-to-end RL model, we decompose the task into two stages: predicting the ball's hitting state and then learning the racket strokes from that prediction. A novel policy gradient approach with a TD3 backbone is proposed for the second stage. In the experiments, we show that the proposed approach significantly outperforms existing RL methods in simulation. To cross the domain from simulation to reality, we develop an efficient retraining method and test it in three real scenarios, achieving a success rate of 98%.
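For readers unfamiliar with the TD3 backbone mentioned above, the following is a minimal sketch of the critic target used in standard TD3 (target policy smoothing plus the minimum over two target critics). It is not the novel policy gradient variant proposed in the paper, which the abstract does not specify. The function name `td3_target`, its parameters, and the default hyperparameter values are assumptions chosen for illustration only.

```python
import numpy as np

def td3_target(reward, done, next_state,
               actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               max_action=1.0, rng=None):
    """Standard TD3 critic target (illustrative; not the paper's variant).

    actor_target:   callable, next_state -> action batch
    critic*_target: callables, (state, action) -> Q-value batch
    All names and default values here are hypothetical.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Target policy smoothing: add clipped Gaussian noise to the target action.
    next_action = actor_target(next_state)
    noise = np.clip(rng.normal(0.0, noise_std, size=next_action.shape),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)

    # Clipped double Q-learning: take the smaller of the two target critics.
    q_min = np.minimum(critic1_target(next_state, next_action),
                       critic2_target(next_state, next_action))

    # Bootstrapped target; terminal transitions (done = 1) do not bootstrap.
    return reward + gamma * (1.0 - done) * q_min
```

In a two-stage setup such as the one described above, `next_state` would plausibly be built from the predicted hitting state of the ball, but that mapping is an assumption and is not stated in the abstract.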
