ﻻ يوجد ملخص باللغة العربية
In this work, we introduce metric learning (ML) to enhance the deep embedding learning for text-independent speaker verification (SV). Specifically, the deep speaker embedding network is trained with conventional cross entropy loss and auxiliary pair-based ML loss function. For the auxiliary ML task, training samples of a mini-batch are first arranged into pairs, then positive and negative pairs are selected and weighted through their own and relative similarities, and finally the auxiliary ML loss is calculated by the similarity of the selected pairs. To evaluate the proposed method, we conduct experiments on the Speaker in the Wild (SITW) dataset. The results demonstrate the effectiveness of the proposed method.
Speaker verification (SV) systems using deep neural network embeddings, so-called the x-vector systems, are becoming popular due to its good performance superior to the i-vector systems. The fusion of these systems provides improved performance benef
Multi-task learning (MTL) and attention mechanism have been proven to effectively extract robust acoustic features for various speech-related tasks in noisy environments. In this study, we propose an attention-based MTL (ATM) approach that integrates
Speaker embeddings become growing popular in the text-independent speaker verification task. In this paper, we propose two improvements during the training stage. The improvements are both based on triplet cause the training stage and the evaluation
Learning a good speaker embedding is important for many automatic speaker recognition tasks, including verification, identification and diarization. The embeddings learned by softmax are not discriminative enough for open-set verification tasks. Angu
Meta-learning (ML) has recently become a research hotspot in speaker verification (SV). We introduce two methods to improve the meta-learning training for SV in this paper. For the first method, a backbone embedding network is first jointly trained w