CVC: Contrastive Learning for Non-parallel Voice Conversion

نشر في Tingle Li بتاريخ 2020 في مجال الهندسة المعلوماتية والبحث باللغة English تحميل البحث

الملخص بالإنكليزية

Cycle consistent generative adversarial network (CycleGAN) and variational autoencoder (VAE) based models have gained popularity in non-parallel voice conversion recently. However, they often suffer from difficult training process and unsatisfactory results. In this paper, we propose CVC, a contrastive learning-based adversarial approach for voice conversion. Compared to previous CycleGAN-based methods, CVC only requires an efficient one-way GAN training by taking the advantage of contrastive learning. When it comes to non-parallel one-to-one voice conversion, CVC is on par or better than CycleGAN and VAE while effectively reducing training time. CVC further demonstrates superior performance in many-to-one voice conversion, enabling the conversion from unseen speakers.

تحميل البحث