Deep Spatiotemporal Representation of the Face for Automatic Pain Intensity Estimation

112 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Mohammad Tavakolian

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Mohammad Tavakolian - Abdenour Hadid

الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Automatic pain intensity assessment has a high value in disease diagnosis applications. Inspired by the fact that many diseases and brain disorders can interrupt normal facial expression formation, we aim to develop a computational model for automatic pain intensity assessment from spontaneous and micro facial variations. For this purpose, we propose a 3D deep architecture for dynamic facial video representation. The proposed model is built by stacking several convolutional modules where each module encompasses a 3D convolution kernel with a fixed temporal depth, several parallel 3D convolutional kernels with different temporal depths, and an average pooling layer. Deploying variable temporal depths in the proposed architecture allows the model to effectively capture a wide range of spatiotemporal variations on the faces. Extensive experiments on the UNBC-McMaster Shoulder Pain Expression Archive database show that our proposed model yields in a promising performance compared to the state-of-the-art in automatic pain intensity estimation.

قيم البحث

120 - Gnana Praveen R , Eric Granger , Patrick Cardinal 2020

Automatic estimation of pain intensity from facial expressions in videos has an immense potential in health care applications. However, domain adaptation (DA) is needed to alleviate the problem of domain shifts that typically occurs between video dat a captured in source and target do-mains. Given the laborious task of collecting and annotating videos, and the subjective bias due to ambiguity among adjacent intensity levels, weakly-supervised learning (WSL)is gaining attention in such applications. Yet, most state-of-the-art WSL models are typically formulated as regression problems, and do not leverage the ordinal relation between intensity levels, nor the temporal coherence of multiple consecutive frames. This paper introduces a new deep learn-ing model for weakly-supervised DA with ordinal regression(WSDA-OR), where videos in target domain have coarse la-bels provided on a periodic basis. The WSDA-OR model enforces ordinal relationships among the intensity levels as-signed to the target sequences, and associates multiple relevant frames to sequence-level labels (instead of a single frame). In particular, it learns discriminant and domain-invariant feature representations by integrating multiple in-stance learning with deep adversarial DA, where soft Gaussian labels are used to efficiently represent the weak ordinal sequence-level labels from the target domain. The proposed approach was validated on the RECOLA video dataset as fully-labeled source domain, and UNBC-McMaster video data as weakly-labeled target domain. We have also validated WSDA-OR on BIOVID and Fatigue (private) datasets for sequence level estimation. Experimental results indicate that our approach can provide a significant improvement over the state-of-the-art models, allowing to achieve a greater localization accuracy.

الرؤية الحاسوبية وتمييز الأنماط

Deep Domain Adaptation for Ordinal Regression of Pain Intensity Estimation Using Weakly-Labelled Videos

87 - Gnana Praveen R , Eric Granger , Patrick Cardinal 2020

Estimation of pain intensity from facial expressions captured in videos has an immense potential for health care applications. Given the challenges related to subjective variations of facial expressions, and operational capture conditions, the accura cy of state-of-the-art DL models for recognizing facial expressions may decline. Domain adaptation has been widely explored to alleviate the problem of domain shifts that typically occur between video data captured across various source and target domains. Moreover, given the laborious task of collecting and annotating videos, and subjective bias due to ambiguity among adjacent intensity levels, weakly-supervised learning is gaining attention in such applications. State-of-the-art WSL models are typically formulated as regression problems, and do not leverage the ordinal relationship among pain intensity levels, nor temporal coherence of multiple consecutive frames. This paper introduces a new DL model for weakly-supervised DA with ordinal regression that can be adapted using target domain videos with coarse labels provided on a periodic basis. The WSDA-OR model enforces ordinal relationships among intensity levels assigned to target sequences, and associates multiple relevant frames to sequence-level labels. In particular, it learns discriminant and domain-invariant feature representations by integrating multiple instance learning with deep adversarial DA, where soft Gaussian labels are used to efficiently represent the weak ordinal sequence-level labels from target domain. The proposed approach was validated using RECOLA video dataset as fully-labeled source domain data, and UNBC-McMaster shoulder pain video dataset as weakly-labeled target domain data. We have also validated WSDA-OR on BIOVID and Fatigue datasets for sequence level estimation.

الرؤية الحاسوبية وتمييز الأنماط

Learning Deep Face Representation

161 - Haoqiang Fan , Zhimin Cao , Yuning Jiang 2014

Face representation is a crucial step of face recognition systems. An optimal face representation should be discriminative, robust, compact, and very easy-to-implement. While numerous hand-crafted and learning-based representations have been proposed , considerable room for improvement is still present. In this paper, we present a very easy-to-implement deep learning framework for face representation. Our method bases on a new structure of deep network (called Pyramid CNN). The proposed Pyramid CNN adopts a greedy-filter-and-down-sample operation, which enables the training procedure to be very fast and computation-efficient. In addition, the structure of Pyramid CNN can naturally incorporate feature sharing across multi-scale face representations, increasing the discriminative ability of resulting representation. Our basic network is capable of achieving high recognition accuracy ($85.8%$ on LFW benchmark) with only 8 dimension representation. When extended to feature-sharing Pyramid CNN, our system achieves the state-of-the-art performance ($97.3%$) on LFW benchmark. We also introduce a new benchmark of realistic face images on social network and validate our proposed representation has a good ability of generalization.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي

Automatic Estimation of Self-Reported Pain by Interpretable Representations of Motion Dynamics

68 - Benjamin Szczapa , Mohamed Daoudi , Stefano Berretti 2020

We propose an automatic method for pain intensity measurement from video. For each video, pain intensity was measured using the dynamics of facial movement using 66 facial points. Gram matrices formulation was used for facial points trajectory repres entations on the Riemannian manifold of symmetric positive semi-definite matrices of fixed rank. Curve fitting and temporal alignment were then used to smooth the extracted trajectories. A Support Vector Regression model was then trained to encode the extracted trajectories into ten pain intensity levels consistent with the Visual Analogue Scale for pain intensity measurement. The proposed approach was evaluated using the UNBC McMaster Shoulder Pain Archive and was compared to the state-of-the-art on the same data. Using both 5-fold cross-validation and leave-one-subject-out cross-validation, our results are competitive with respect to state-of-the-art methods.

الرؤية الحاسوبية وتمييز الأنماط

Pain Intensity Estimation from Mobile Video Using 2D and 3D Facial Keypoints

61 - Matthew Lee , Lyndon Kennedy , Andreas Girgensohn 2020

Managing post-surgical pain is critical for successful surgical outcomes. One of the challenges of pain management is accurately assessing the pain level of patients. Self-reported numeric pain ratings are limited because they are subjective, can be affected by mood, and can influence the patients perception of pain when making comparisons. In this paper, we introduce an approach that analyzes 2D and 3D facial keypoints of post-surgical patients to estimate their pain intensity level. Our approach leverages the previously unexplored capabilities of a smartphone to capture a dense 3D representation of a persons face as input for pain intensity level estimation. Our contributions are adata collection study with post-surgical patients to collect ground-truth labeled sequences of 2D and 3D facial keypoints for developing a pain estimation algorithm, a pain estimation model that uses multiple instance learning to overcome inherent limitations in facial keypoint sequences, and the preliminary results of the pain estimation model using 2D and 3D features with comparisons of alternate approaches.

الرؤية الحاسوبية وتمييز الأنماط تفاعل الإنسان والحاسوب التعلم الآلي

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الوادي الدولية الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Deep Spatiotemporal Representation of the Face for Automatic Pain Intensity Estimation

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً