
Particle Filter Based Monocular Human Tracking with a 3D Cardbox Model and a Novel Deterministic Resampling Strategy

Published by Ziyuan Liu
Publication date: 2020
Research field: Informatics Engineering
Language: English





The main challenge of markerless human motion tracking is the high dimensionality of the search space, so exploring that space efficiently is of great significance. In this paper, a motion capture algorithm is proposed for upper body motion tracking. The proposed system tracks human motion based on monocular silhouette matching and is built on top of a hierarchical particle filter, within which a novel deterministic resampling strategy (DRS) is applied. The system is evaluated quantitatively against ground truth data measured by an inertial sensor system. In addition, we compare the DRS with the stratified resampling strategy (SRS). Experiments show that DRS outperforms SRS with the same number of particles. Moreover, a new 3D articulated human upper body model, named the 3D cardbox model, is created and is shown to work successfully for motion tracking. Experiments show that the proposed system can robustly track upper body motion without self-occlusion. Motions towards the camera can also be tracked well.
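The abstract compares DRS against the stratified resampling strategy (SRS) but does not describe either in detail. As a point of reference only, the following is a minimal sketch of standard stratified resampling for a particle filter. The function name `stratified_resample` and its signature are hypothetical, and this is not the authors' implementation; it does not attempt to reproduce DRS.

```python
import random


def stratified_resample(weights, rng=random):
    """Return particle indices chosen by standard stratified resampling.

    Hypothetical helper for illustration; not taken from the paper.
    The unit interval is split into n equal strata and one uniform
    sample is drawn inside each stratum.
    """
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0:
        raise ValueError("weights must be non-empty with a positive sum")

    indices = []
    i = 0
    cumulative = weights[0] / total  # running sum of normalized weights
    for k in range(n):
        # One uniform draw inside the k-th stratum [k/n, (k+1)/n).
        u = (k + rng.random()) / n
        # Advance to the particle whose cumulative weight interval holds u.
        # The i < n - 1 guard protects against floating-point round-off.
        while u >= cumulative and i < n - 1:
            i += 1
            cumulative += weights[i] / total
        indices.append(i)
    return indices
```

Because the draws are spread one per stratum, the returned indices are always non-decreasing, and a particle with zero weight is never selected. The particle set after resampling would be `[particles[i] for i in indices]`, with all weights reset to `1 / n`.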


Read also

This paper proposes a novel simultaneous localization and mapping (SLAM) approach, namely Attention-SLAM, which simulates human navigation mode by combining a visual saliency model (SalNavNet) with traditional monocular visual SLAM. Most SLAM methods treat all the features extracted from the images as equally important during the optimization process. However, the salient feature points in scenes have a more significant influence during the human navigation process. Therefore, we first propose a visual saliency model called SalNavNet, in which we introduce a correlation module and propose an adaptive Exponential Moving Average (EMA) module. These modules mitigate the center bias to enable the saliency maps generated by SalNavNet to pay more attention to the same salient object. Moreover, the saliency maps simulate human behavior for the refinement of SLAM results. The feature points extracted from the salient regions have greater importance in the optimization process. We add semantic saliency information to the EuRoC dataset to generate an open-source saliency SLAM dataset. Comprehensive test results show that Attention-SLAM outperforms benchmarks such as Direct Sparse Odometry (DSO), ORB-SLAM, and Salient DSO in terms of efficiency, accuracy, and robustness in most test cases.
Dapeng Zhao, Jean Oh 2020
We propose a Convolutional Neural Network-based approach to learn, detect, and extract patterns in sequential trajectory data, known here as Social Pattern Extraction Convolution (Social-PEC). A set of experiments carried out on the human trajectory prediction problem shows that our model performs comparably to the state of the art and outperforms it in some cases. More importantly, the proposed approach unveils the obscurity in the previous use of the pooling layer, presenting a way to intuitively explain the decision-making process.
We present a new trainable system for physically plausible markerless 3D human motion capture, which achieves state-of-the-art results in a broad range of challenging scenarios. Unlike most neural methods for human motion capture, our approach, which we dub physionical, is aware of physical and environmental constraints. It combines in a fully differentiable way several key innovations, i.e., 1. a proportional-derivative controller, with gains predicted by a neural network, that reduces delays even in the presence of fast motions, 2. an explicit rigid body dynamics model and 3. a novel optimisation layer that prevents physically implausible foot-floor penetration as a hard constraint. The inputs to our system are 2D joint keypoints, which are canonicalised in a novel way so as to reduce the dependency on intrinsic camera parameters, both at train and test time. This enables more accurate global translation estimation without generalisability loss. Our model can be finetuned with 2D annotations alone when 3D annotations are not available. It produces smooth and physically principled 3D motions at an interactive frame rate in a wide variety of challenging scenes, including newly recorded ones. Its advantages are especially noticeable on in-the-wild sequences that significantly differ from common 3D pose estimation benchmarks such as Human 3.6M and MPI-INF-3DHP. Qualitative results are available at http://gvv.mpi-inf.mpg.de/projects/PhysAware/
3D human pose estimation is still a challenging problem despite the large amount of work that has been done in this field. Generally, most methods directly use neural networks and ignore certain constraints (e.g., reprojection constraints and joint angle and bone length constraints). This paper proposes a weakly supervised GAN-based model for 3D human pose estimation that considers 3D information along with 2D information simultaneously, in which a reprojection network is employed to learn the mapping of the distribution from 3D poses to 2D poses. In particular, we train the reprojection network and the generative adversarial network synchronously. Furthermore, inspired by the typical kinematic chain space (KCS) matrix, we propose a weighted KCS matrix, which is added to the discriminator's input to impose joint angle and bone length constraints. The experimental results on Human3.6M show that our method outperforms state-of-the-art methods by approximately 5.1%.
Ankit Dhall 2018
We propose a complete pipeline that detects objects and simultaneously estimates the pose of these multiple object instances using just a single image. A novel keypoint regression scheme with a cross-ratio term is introduced that exploits prior information about the object's shape and size to regress and find specific feature points. Further, a priori 3D information about the object is used to match 2D-3D correspondences and accurately estimate object positions up to a distance of 15m. A detailed discussion of the results and an in-depth analysis of the pipeline are presented. The pipeline runs efficiently on a low-powered Jetson TX2 and is deployed as part of the perception pipeline on a real-time autonomous vehicle cruising at a top speed of 54 km/hr.