We examine the utility of implicit user behavioral signals captured using low-cost, off-the-shelf devices for anonymous gender and emotion recognition. A user study designed to examine male and female sensitivity to facial emotions confirms that females recognize emotions (especially negative ones) more quickly and more accurately than males, mirroring prior findings. Implicit viewer responses in the form of EEG brain signals and eye movements are then examined for the existence of (a) emotion- and gender-specific patterns in event-related potentials (ERPs) and fixation distributions and (b) emotion and gender discriminability. Experiments reveal that (i) gender- and emotion-specific differences are observable from ERPs, (ii) multiple similarities exist between explicit responses gathered from users and their implicit behavioral signals, and (iii) significantly above-chance ($\approx$70%) gender recognition is achievable by comparing emotion-specific EEG responses, with gender differences encoded best for anger and disgust. In addition, modest valence (positive vs. negative emotion) recognition is achieved with EEG and eye-based features.
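As a point of reference for the ERP analysis mentioned above, an ERP is conventionally estimated by averaging stimulus-locked EEG epochs sample by sample. The following is a minimal sketch of that general technique only; the function names, the baseline-correction step, and the toy values are illustrative assumptions and are not taken from the study.

```python
from typing import List, Sequence


def baseline_correct(epoch: Sequence[float], n_baseline: int) -> List[float]:
    """Subtract the mean of the first n_baseline (pre-stimulus) samples."""
    base = sum(epoch[:n_baseline]) / n_baseline
    return [x - base for x in epoch]


def erp_average(epochs: Sequence[Sequence[float]]) -> List[float]:
    """Average equal-length, stimulus-locked epochs sample by sample."""
    if not epochs:
        raise ValueError("need at least one epoch")
    n_samples = len(epochs[0])
    if any(len(e) != n_samples for e in epochs):
        raise ValueError("all epochs must have the same length")
    return [sum(e[t] for e in epochs) / len(epochs) for t in range(n_samples)]


# Toy single-channel epochs (hypothetical values, two pre-stimulus samples each).
epochs = [[1.0, 1.0, 3.0, 5.0], [3.0, 3.0, 5.0, 9.0]]
corrected = [baseline_correct(e, n_baseline=2) for e in epochs]
erp = erp_average(corrected)
```

Averaging one set of epochs per emotion and per gender group would give waveforms that can be compared across conditions, which is the kind of comparison the abstract describes.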
This work explores the utility of implicit behavioral cues, namely electroencephalogram (EEG) signals and eye movements, for gender recognition (GR) and emotion recognition (ER) from psychophysical behavior. Specifically, the examined cues are acquired via low-cost, off-the-shelf sensors. Twenty-eight users (14 male) recognized emotions from unoccluded (no mask) and partially occluded (eye or mouth masked) emotive faces; their EEG responses contained gender-specific differences, while their eye movements were characteristic of the perceived facial emotions. Experimental results reveal that (a) reliable GR and ER are achievable with EEG and eye features, (b) differential cognitive processing of negative emotions is observed for females, and (c) eye gaze-based gender differences manifest under partial face occlusion, as typified by the eye and mouth mask conditions.
We examine the utility of implicit behavioral cues in the form of EEG brain signals and eye movements for gender recognition (GR) and emotion recognition (ER). Specifically, the examined cues are acquired via low-cost, off-the-shelf sensors. We asked 28 viewers (14 female) to recognize emotions from unoccluded (no mask) as well as partially occluded (eye and mouth masked) emotive faces. Experimental results reveal that (a) reliable GR and ER are achievable with EEG and eye features, (b) differential cognitive processing, especially of negative emotions, is observed for males and females, and (c) some of these cognitive differences manifest under partial face occlusion, as typified by the eye and mouth mask conditions.
User-independent emotion recognition from large-scale physiological signals is a difficult problem. Many advanced methods exist, but they have been evaluated on relatively small datasets with only dozens of subjects. Here, we propose Res-SIN, a novel end-to-end framework that uses Electrodermal Activity (EDA) signal images to classify human emotion. We first apply convex optimization-based EDA (cvxEDA) to decompose the signals and mine static and dynamic emotion changes. We then transform the decomposed signals into images so that they can be processed effectively by CNN frameworks. Res-SIN combines individual emotion features and external emotion benchmarks to accelerate convergence. We evaluate our approach on the PMEmo dataset, currently the largest emotional dataset containing music and EDA signals. To the best of the authors' knowledge, our method is the first attempt at large-scale subject-independent emotion classification, using 7962 pieces of EDA signals from 457 subjects. Experimental results demonstrate the reliability of our model, and the binary classification accuracies of 73.65% and 73.43% on the arousal and valence dimensions can be used as a baseline.
Recently, increasing attention has been directed to the study of speech emotion recognition, in which global acoustic features of an utterance are mostly used to eliminate content differences. However, the expression of speech emotion is a dynamic process, reflected in dynamic durations, energies, and other prosodic information when one speaks. In this paper, a novel local dynamic pitch probability distribution feature, obtained by computing a histogram, is proposed to improve the accuracy of speech emotion recognition. Compared with most previous works, which use global features, the proposed method takes advantage of the local dynamic information conveyed by emotional speech. Several experiments on the Berlin Database of Emotional Speech are conducted to verify the effectiveness of the proposed method. The experimental results demonstrate that the local dynamic information obtained with the proposed method is more effective for speech emotion recognition than traditional global features.
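To make the idea of a local, histogram-based pitch feature concrete, the sketch below splits a pitch (F0) track into consecutive segments and computes a normalized histogram for each one. The abstract does not specify the segmentation, bin edges, or pitch range, so `n_segments`, `lo`, `hi`, `n_bins`, the function names, and the convention that non-positive values mark unvoiced frames are all assumptions made for illustration, not the paper's actual configuration.

```python
from typing import List, Sequence


def pitch_histogram(values: Sequence[float], lo: float, hi: float,
                    n_bins: int) -> List[float]:
    """Normalized histogram (sums to 1) of voiced pitch values, clipped to [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for f0 in values:
        if f0 <= 0:  # assumption: non-positive values mark unvoiced frames
            continue
        clipped = min(max(f0, lo), hi)
        idx = min(int((clipped - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def local_pitch_features(f0_track: Sequence[float], n_segments: int = 3,
                         lo: float = 50.0, hi: float = 500.0,
                         n_bins: int = 5) -> List[float]:
    """Concatenate per-segment pitch histograms into one feature vector."""
    seg_len = len(f0_track) / n_segments
    features: List[float] = []
    for s in range(n_segments):
        start = int(round(s * seg_len))
        end = int(round((s + 1) * seg_len))
        features.extend(pitch_histogram(f0_track[start:end], lo, hi, n_bins))
    return features


# Hypothetical F0 track in Hz (0 = unvoiced frame), rising over the utterance.
f0 = [100, 110, 0, 120, 200, 210, 220, 230, 300, 310, 0, 320]
features = local_pitch_features(f0)
```

A single global histogram over this track would record only that low, mid, and high pitch values all occur; the per-segment histograms also preserve the order in which they occur, which is the local dynamic information the abstract contrasts with global features.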
This paper describes the details of Sighthound's fully automated age, gender and emotion recognition system. The backbone of our system consists of several deep convolutional neural networks that are not only computationally inexpensive, but also provide state-of-the-art results on several competitive benchmarks. To power our novel deep networks, we collected large labeled datasets through a semi-supervised pipeline to reduce the annotation effort and time. We tested our system on several public benchmarks and report outstanding results. Our age, gender and emotion recognition models are available to developers through the Sighthound Cloud API at https://www.sighthound.com/products/cloud