In this work, we introduce our submission to the 2nd Affective Behavior Analysis in-the-wild (ABAW) 2021 competition. We train a unified deep learning model on multiple databases to perform two tasks: prediction of the seven basic facial expressions and valence-arousal estimation. Since these databases do not contain labels for both tasks, we apply the knowledge distillation technique to train two networks: one teacher and one student model. The student model is trained using both ground-truth labels and soft labels derived from the pretrained teacher model. During training, we add a third task, the combination of the two aforementioned tasks, to better exploit inter-task correlations. We also exploit the videos shared between the two tasks of the AffWild2 database used in the competition to further improve the performance of the network. Experimental results show that the network achieves promising results on the validation set of the AffWild2 database. Code and pretrained model are publicly available at https://github.com/glmanhtu/multitask-abaw-2021
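To make the teacher-student setup concrete, the following is a minimal sketch of how the student's expression loss could combine ground-truth labels with soft labels produced by the pretrained teacher. The function name, the temperature, the alpha weighting, and the use of -1 to mark samples without expression annotations are illustrative assumptions, not taken from the released code.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, hard_labels,
                      temperature=2.0, alpha=0.5):
    """Combine hard-label cross-entropy with soft-label KL divergence.

    student_logits, teacher_logits: (batch, 7) expression logits.
    hard_labels: (batch,) ground-truth expression indices, set to -1
    for samples coming from a database without expression annotations.
    """
    # Soft targets from the (frozen) teacher, softened by the temperature.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_loss = F.kl_div(log_student, soft_targets,
                         reduction="batchmean") * temperature ** 2

    # Hard-label loss only on the samples that actually carry annotations.
    mask = hard_labels >= 0
    if mask.any():
        hard_loss = F.cross_entropy(student_logits[mask], hard_labels[mask])
    else:
        hard_loss = torch.zeros((), device=student_logits.device)

    return alpha * hard_loss + (1.0 - alpha) * soft_loss
```

In this sketch, samples from databases lacking expression labels still contribute through the soft-label term, which is what allows a single student network to be trained across databases with heterogeneous annotations.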