Human action recognition is a well-known computer vision and pattern recognition task of identifying which action a person is performing. Extracting the keypoint information of a single person, together with the spatial and temporal features of the action sequence, is essential to accomplishing this task. In this paper, we propose a human action recognition system for Red-Green-Blue (RGB) input video built from our own designed modules. On top of an efficient Gated Recurrent Unit (GRU) for spatio-temporal feature extraction, we add a sampling module and a normalization module to improve recognition performance. Furthermore, we build a novel dataset with similar backgrounds and discriminative actions for both human keypoint prediction and behavior recognition, and we retrain the pose model on this dataset to further improve its accuracy. Experimental results demonstrate the effectiveness of the proposed model on our own human behavior recognition dataset and on several public datasets.
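To make the pipeline concrete, the following is a minimal sketch, not the authors' implementation, of how a GRU can classify actions from per-frame keypoints, with a simple uniform temporal sampling step and a per-frame keypoint normalization step standing in for the sampling and normalization modules described above. All names, shapes, and parameter values (e.g. 17 joints, 32 sampled frames, `GRUActionClassifier`) are illustrative assumptions.

```python
# Minimal sketch (not the paper's code): a GRU classifier over 2D keypoint
# sequences, with uniform temporal sampling and per-frame normalization.
import torch
import torch.nn as nn


def sample_frames(keypoints, num_frames=32):
    """Uniformly sample a fixed number of frames from a (T, J, 2) sequence."""
    t = keypoints.shape[0]
    idx = torch.linspace(0, t - 1, num_frames).long()
    return keypoints[idx]


def normalize_keypoints(keypoints):
    """Center each frame on its mean joint position and scale to unit std."""
    t = keypoints.shape[0]
    mean = keypoints.mean(dim=1, keepdim=True)                 # (T, 1, 2)
    std = keypoints.reshape(t, -1).std(dim=1).view(t, 1, 1) + 1e-6
    return (keypoints - mean) / std


class GRUActionClassifier(nn.Module):
    """GRU over flattened per-frame keypoints, followed by a linear head."""

    def __init__(self, num_joints=17, hidden_size=128, num_classes=10):
        super().__init__()
        self.gru = nn.GRU(input_size=num_joints * 2,
                          hidden_size=hidden_size,
                          num_layers=2,
                          batch_first=True)
        self.head = nn.Linear(hidden_size, num_classes)

    def forward(self, x):                                      # x: (B, T, J, 2)
        b, t, j, c = x.shape
        feats, _ = self.gru(x.reshape(b, t, j * c))            # (B, T, H)
        return self.head(feats[:, -1])                         # logits from last step


if __name__ == "__main__":
    seq = torch.randn(120, 17, 2)                              # one pose sequence (T, J, 2)
    clip = normalize_keypoints(sample_frames(seq)).unsqueeze(0)
    logits = GRUActionClassifier()(clip)
    print(logits.shape)                                        # torch.Size([1, 10])
```

In practice the keypoints would come from a pose estimator run on the RGB frames, and classification could also pool GRU outputs over all time steps rather than reading only the last hidden state; the sketch keeps the simplest variant.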