
Relation Modeling with Graph Convolutional Networks for Facial Action Unit Detection

 Added by Zhilei Liu
 Publication date 2019
Research language: English





Most existing AU detection works that consider AU relationships rely on probabilistic graphical models with manually extracted features. This paper proposes an end-to-end deep learning framework for facial AU detection that uses a graph convolutional network (GCN) for AU relation modeling, which has not been explored before. In particular, AU-related regions are first extracted, and latent representations full of AU information are learned through an auto-encoder. Each latent representation vector is then fed into the GCN as a node, and the connection mode of the GCN is determined by the relationships among AUs. Finally, the assembled features updated through the GCN are concatenated for AU detection. Extensive experiments on the BP4D and DISFA benchmarks demonstrate that our framework significantly outperforms the state-of-the-art methods for facial AU detection. The proposed framework is also validated through a series of ablation studies.
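To make the node-update step in the abstract above concrete, the following is a minimal sketch of one graph-convolution step over AU nodes followed by concatenation. It assumes the widely used symmetric normalization D^{-1/2}(A + I)D^{-1/2}, which the abstract does not confirm as the paper's exact rule. The adjacency matrix, feature vectors, weights, and all function names are hypothetical illustrations, not values or code from the paper.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own representation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj_norm, features, weight):
    """One propagation step: ReLU(adj_norm @ features @ weight)."""
    out = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical graph: three AU nodes; node 0 and node 1 are related,
# node 2 has no related AU (assumed for illustration only).
adj = [[0, 1, 0],
       [1, 0, 0],
       [0, 0, 0]]

# Hypothetical 2-dimensional latent vectors, one row per AU node.
features = [[1.0, 0.0],
            [0.0, 1.0],
            [2.0, 2.0]]

# Identity weights keep the arithmetic easy to follow.
weight = [[1.0, 0.0],
          [0.0, 1.0]]

updated = gcn_layer(normalize_adjacency(adj), features, weight)

# Concatenate the updated node features into one vector for a classifier.
fused = [v for row in updated for v in row]
print(fused)
```

In this toy graph the two related nodes end up with the average of their features, while the isolated node is unchanged, which is the kind of relation-driven feature sharing the abstract describes.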




The attention mechanism has recently attracted increasing attention in the field of facial action unit (AU) detection. By finding the region of interest of each AU with the attention mechanism, AU-related local features can be captured. Most existing attention-based AU detection works use prior knowledge to predefine fixed attentions or refine the predefined attentions within a small range, which limits their capacity to model various AUs. In this paper, we propose an end-to-end deep learning based attention and relation learning framework for AU detection with only AU labels, which has not been explored before. In particular, multi-scale features shared by each AU are first learned, and then both channel-wise and spatial attentions are adaptively learned to select and extract AU-related local features. Moreover, pixel-level relations for AUs are further captured to refine spatial attentions so as to extract more relevant local features. Without changing the network architecture, our framework can be easily extended for AU intensity estimation. Extensive experiments show that our framework (i) soundly outperforms the state-of-the-art methods for both AU detection and AU intensity estimation on the challenging BP4D, DISFA, FERA 2015 and BP4D+ benchmarks, (ii) can adaptively capture the correlated regions of each AU, and (iii) also works well under severe occlusions and large poses.
Spatio-temporal relations among facial action units (AUs) convey significant information for AU detection yet have not been thoroughly exploited. The main reasons are the limited capability of current AU detection works in simultaneously learning spatial and temporal relations, and the lack of precise localization information for AU feature learning. To tackle these limitations, we propose a novel spatio-temporal relation and attention learning framework for AU detection. Specifically, we introduce a spatio-temporal graph convolutional network to capture both spatial and temporal relations from dynamic AUs, in which the AU relations are formulated as a spatio-temporal graph with adaptively learned instead of predefined edge weights. Moreover, the learning of spatio-temporal relations among AUs requires individual AU features. Considering the dynamism and shape irregularity of AUs, we propose an attention regularization method to adaptively learn regional attentions that capture highly relevant regions and suppress irrelevant regions so as to extract a complete feature for each AU. Extensive experiments show that our approach achieves substantial improvements over the state-of-the-art AU detection methods on BP4D and especially DISFA benchmarks.
This paper describes an approach to facial action unit (AU) detection, presented as our submission to the Affective Behavior Analysis in-the-wild (ABAW) 2021 competition. The proposed method uses the pre-trained JAA model as the feature extractor and extracts global features, face alignment features, and AU local features on the basis of multi-scale features. We take the AU local features as the input of the graph convolution to further consider the correlations between AUs, and finally use the fused features to classify AUs. Detection performance was evaluated by 0.5*accuracy + 0.5*F1. Our model achieves 0.674 on the challenging Aff-Wild2 database.
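The combined metric 0.5*accuracy + 0.5*F1 mentioned above can be sketched for a single binary AU label vector as follows. How the competition averages the score across AUs is not stated in the abstract, so this sketch assumes one AU only; the labels and function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of frames where the prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (AU present = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Return 0.0 when there are no positives at all, to avoid dividing by zero.
    return 2 * tp / denom if denom else 0.0


def combined_score(y_true, y_pred):
    """Equal-weight mix of accuracy and F1, as stated in the abstract."""
    return 0.5 * accuracy(y_true, y_pred) + 0.5 * f1_score(y_true, y_pred)


# Hypothetical per-frame labels for one AU (1 = active, 0 = inactive).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

score = combined_score(y_true, y_pred)
print(score)
```

Mixing the two terms tempers the effect of class imbalance: accuracy alone can look high when an AU is rarely active, while F1 focuses on the active frames.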
Current pain assessment methods rely on patient self-report or on an observer such as an Intensive Care Unit (ICU) nurse. Patient self-report is subjective and suffers from poor recall. Pain assessment by manual observation is limited by the number of administrations per day and by staff workload. Previous studies showed the feasibility of automatic pain assessment by detecting facial action units (AUs), since pain is observed to be associated with certain AUs. This method of pain assessment can overcome the pitfalls of present-day pain assessment techniques. However, all previous studies are limited to controlled-environment data. In this study, we evaluated the performance of OpenFace, an open-source facial behavior analysis tool, and AU R-CNN on real-world ICU data. The presence of assisted breathing devices, the variable lighting of ICUs, and patient orientation with respect to the camera significantly affected the performance of the models, although they had shown state-of-the-art results in facial behavior analysis tasks. We show the need for an automated pain assessment system trained on real-world ICU data in order to achieve clinically acceptable pain assessment.
Facial action unit (AU) detection in the wild is a challenging problem, due to the unconstrained variability in facial appearances and the lack of accurate annotations. Most existing methods depend on either impractical labor-intensive labeling or inaccurate pseudo labels. In this paper, we propose an end-to-end unconstrained facial AU detection framework based on domain adaptation, which transfers accurate AU labels from a constrained source domain to an unconstrained target domain by exploiting labels of AU-related facial landmarks. Specifically, we map a source image with label and a target image without label into a latent feature domain by combining source landmark-related feature with target landmark-free feature. Due to the combination of source AU-related information and target AU-free information, the latent feature domain with transferred source label can be learned by maximizing the target-domain AU detection performance. Moreover, we introduce a novel landmark adversarial loss to disentangle the landmark-free feature from the landmark-related feature by treating the adversarial learning as a multi-player minimax game. Our framework can also be naturally extended for use with target-domain pseudo AU labels. Extensive experiments show that our method soundly outperforms lower-bounds and upper-bounds of the basic model, as well as state-of-the-art approaches on the challenging in-the-wild benchmarks. The code is available at https://github.com/ZhiwenShao/ADLD.