ﻻ يوجد ملخص باللغة العربية
Predicting future human motion plays a significant role in human-machine interactions for a variety of real-life applications. In this paper, we build a deep state-space model, DeepSSM, to predict future human motion. Specifically, we formulate the human motion system as the state-space model of a dynamic system and model the motion system by the state-space theory, offering a unified formulation for diverse human motion systems. Moreover, a novel deep network is designed to build this system, enabling us to utilize both the advantages of deep network and state-space model. The deep network jointly models the process of both the state-state transition and the state-observation transition of the human motion system, and multiple future poses can be generated via the state-observation transition of the model recursively. To improve the modeling ability of the system, a unique loss function, ATPL (Attention Temporal Prediction Loss), is introduced to optimize the model, encouraging the system to achieve more accurate predictions by paying increasing attention to the early time-steps. The experiments on two benchmark datasets (i.e., Human3.6M and 3DPW) confirm that our method achieves state-of-the-art performance with improved effectiveness. The code will be available if the paper is accepted.
Human motion prediction is a challenging and important task in many computer vision application domains. Existing work only implicitly models the spatial structure of the human skeleton. In this paper, we propose a novel approach that decomposes the
In this paper, we propose a novel Transformer-based architecture for the task of generative modelling of 3D human motion. Previous works commonly rely on RNN-based models considering shorter forecast horizons reaching a stationary and often implausib
Human motion prediction from historical pose sequence is at the core of many applications in machine intelligence. However, in current state-of-the-art methods, the predicted future motion is confined within the same activity. One can neither generat
The task of predicting human motion is complicated by the natural heterogeneity and compositionality of actions, necessitating robustness to distributional shifts as far as out-of-distribution (OoD). Here we formulate a new OoD benchmark based on the
Human motion prediction, which aims at predicting future human skeletons given the past ones, is a typical sequence-to-sequence problem. Therefore, extensive efforts have been continued on exploring different RNN-based encoder-decoder architectures.