Self-supervised Deformation Modeling for Facial Expression Editing

published by ShahRukh Athar in 2019 in Informatics Engineering and research's language is English Download

Abstract in English

Recent advances in deep generative models have demonstrated impressive results in photo-realistic facial image synthesis and editing. Facial expressions are inherently the result of muscle movement. However, existing neural network-based approaches usually only rely on texture generation to edit expressions and largely neglect the motion information. In this work, we propose a novel end-to-end network that disentangles the task of facial editing into two steps: a motion-editing step and a texture-editing step. In the motion-editing step, we explicitly model facial movement through image deformation, warping the image into the desired expression. In the texture-editing step, we generate necessary textures, such as teeth and shading effects, for a photo-realistic result. Our physically-based task-disentanglement system design allows each step to learn a focused task, removing the need of generating texture to hallucinate motion. Our system is trained in a self-supervised manner, requiring no ground truth deformation annotation. Using Action Units [8] as the representation for facial expression, our method improves the state-of-the-art facial expression editing performance in both qualitative and quantitative evaluations.

Download