Hierarchical Policies for Cluttered-Scene Grasping with Latent Plans


Abstract in English

6D grasping in cluttered scenes is a longstanding problem in robotic manipulation. Open-loop manipulation pipelines may fail due to inaccurate state estimation, while most end-to-end grasping methods have not yet scaled to complex scenes with obstacles. In this work, we propose a new method for end-to-end learning of 6D grasping in cluttered scenes. Our hierarchical framework learns collision-free, target-driven grasping from partial point cloud observations. During training, we learn an embedding space that encodes expert grasping plans, together with a variational autoencoder that samples diverse grasping trajectories at test time. Furthermore, through hierarchical reinforcement learning, we train a critic network for plan selection and an option classifier for switching to an instance grasping policy. We evaluate and analyze our method, compare it against several baselines in simulation, and demonstrate that latent planning can generalize to real-world cluttered-scene grasping. Our videos and code can be found at https://sites.google.com/view/latent-grasping.
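
The test-time flow summarized above (sample several latent plans, let a critic pick one, and let an option classifier decide when to hand over to the grasping policy) can be illustrated with a toy sketch. This is not the authors' implementation: the function names `sample_latents`, `select_plan`, and `should_switch`, the standard-normal prior, the 0.5 threshold, and the `toy_critic` scoring rule are all assumptions made for illustration, and the learned networks are replaced by simple stand-ins.

```python
import random
from typing import Callable, List, Sequence

Latent = List[float]


def sample_latents(num_samples: int, dim: int, rng: random.Random) -> List[Latent]:
    """Draw candidate latent plan codes from a standard normal prior (assumed)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_samples)]


def select_plan(latents: Sequence[Latent], critic: Callable[[Latent], float]) -> Latent:
    """Return the candidate latent that the critic scores highest."""
    return max(latents, key=critic)


def should_switch(option_prob: float, threshold: float = 0.5) -> bool:
    """Switch to the grasping policy once the classifier output reaches the threshold.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    return option_prob >= threshold


def toy_critic(z: Latent) -> float:
    """Hypothetical stand-in for a learned critic: prefers latents near the origin."""
    return -sum(v * v for v in z)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
candidates = sample_latents(num_samples=8, dim=4, rng=rng)
best = select_plan(candidates, toy_critic)
```

In a real system, the critic and option classifier would be networks conditioned on the point cloud observation, and the selected latent would be decoded into a trajectory; the sketch only shows the sample-then-select structure.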
