ﻻ يوجد ملخص باللغة العربية
We examine the use of linear and non-linear dimensionality reduction algorithms for extracting low-rank feature representations for speech emotion recognition. Two feature sets are used, one based on low-level descriptors and their aggregations (IS10) and one modeling recurrence dynamics of speech (RQA), as well as their fusion. We report speech emotion recognition (SER) results for learned representations on two databases using different classification methods. Classification with low-dimensional representations yields performance improvement in a variety of settings. This indicates that dimensionality reduction is an effective way to combat the curse of dimensionality for SER. Visualization of features in two dimensions provides insight into discriminatory abilities of reduced feature sets.
Many mobile applications and virtual conversational agents now aim to recognize and adapt to emotions. To enable this, data are transmitted from users devices and stored on central servers. Yet, these data contain sensitive information that could be
Speech Emotion Recognition (SER) in a single language has achieved remarkable results through deep learning approaches in the last decade. However, cross-lingual SER remains a challenge in real-world applications due to a great difference between the
Self-supervised learning has attracted plenty of recent research interest. However, most works for self-supervision in speech are typically unimodal and there has been limited work that studies the interaction between audio and visual modalities for
Feature subspace selection is an important part in speech emotion recognition. Most of the studies are devoted to finding a feature subspace for representing all emotions. However, some studies have indicated that the features associated with differe
We investigate the performance of features that can capture nonlinear recurrence dynamics embedded in the speech signal for the task of Speech Emotion Recognition (SER). Reconstruction of the phase space of each speech frame and the computation of it