ﻻ يوجد ملخص باللغة العربية
Human body part segmentation refers to the task of predicting the semantic segmentation mask for each body part. Fully supervised body part segmentation methods achieve good performances but require an enormous amount of effort to annotate part masks for training. In contrast to high annotation costs needed for a limited number of part mask annotations, a large number of weak labels such as poses and full body masks already exist and contain relevant information. Motivated by the possibility of using existing weak labels, we propose the first weakly supervised body part segmentation framework. The core idea is first converting the sparse weak labels such as keypoints to the initial estimate of body part masks, and then iteratively refine the part mask predictions. We name the initial part masks estimated from poses the part priors. With sufficient extra weak labels, our weakly supervised framework achieves a comparable performance (62.0% mIoU) to the fully supervised method (63.6% mIoU) on the Pascal-Person-Part dataset. Furthermore, in the extended semi-supervised setting, the proposed framework outperforms the state-of-art methods. Moreover, we extend our proposed framework to other keypoint-supervised part segmentation tasks such as face parsing.
Human body part parsing, or human semantic part segmentation, is fundamental to many computer vision tasks. In conventional semantic segmentation methods, the ground truth segmentations are provided, and fully convolutional networks (FCN) are trained
In this work, we introduce the new scene understanding task of Part-aware Panoptic Segmentation (PPS), which aims to understand a scene at multiple levels of abstraction, and unifies the tasks of scene parsing and part parsing. For this novel task, w
Human activity recognition in videos has been widely studied and has recently gained significant advances with deep learning approaches; however, it remains a challenging task. In this paper, we propose a novel framework that simultaneously considers
We introduce DOPE, the first method to detect and estimate whole-body 3D human poses, including bodies, hands and faces, in the wild. Achieving this level of details is key for a number of applications that require understanding the interactions of t
Co-part segmentation is an important problem in computer vision for its rich applications. We propose an unsupervised learning approach for co-part segmentation from images. For the training stage, we leverage motion information embedded in videos an