
3D Head-Position Prediction in First-Person View by Considering Head Pose for Human-Robot Eye Contact

Added by Yasunori Ozaki
Publication date: 2021
Language: English

For a humanoid robot to make eye contact and initiate communication with a human, it must estimate the human's head position. However, eye contact becomes difficult because of the robot's mechanical delay while the person with whom it is interacting is moving. It is therefore important to predict the head position in order to mitigate the effect of this delay on the robot's motion. Based on the fact that humans turn their heads before changing direction while walking, we hypothesized that the accuracy of three-dimensional (3D) head-position prediction from the first-person view can be improved by taking the head pose into account. We compared our method with a conventional Kalman filter-based method and found ours to be more accurate. The experimental results show that considering the head pose helps improve the accuracy of 3D head-position prediction.
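
The abstract contrasts a head-pose-aware predictor with a conventional Kalman filter baseline. As a rough Python sketch of those two ideas (not the authors' implementation; the state layout, the blending weight `alpha`, and all names below are assumptions), a constant-velocity Kalman filter can predict the head position over the robot's delay, and the predicted walking direction can be biased toward the head yaw:

```python
import numpy as np

# State: [x, y, z, vx, vy, vz]. dt is the prediction horizon that
# covers the robot's mechanical delay.
def make_cv_model(dt):
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)      # position += velocity * dt
    H = np.zeros((3, 6))
    H[:, :3] = np.eye(3)            # only the 3D position is observed
    return F, H

def kf_predict(x, P, F, Q):
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# Illustrative head-pose-aware prediction: bias the horizontal velocity
# direction toward the heading implied by the head yaw, since people
# tend to turn their head before changing walking direction.
def pose_aware_predict(x, yaw, dt, alpha=0.5):
    speed = np.linalg.norm(x[3:5])
    vel_dir = x[3:5] / (speed + 1e-9)
    head_dir = np.array([np.cos(yaw), np.sin(yaw)])
    blend = (1 - alpha) * vel_dir + alpha * head_dir
    blend /= np.linalg.norm(blend) + 1e-9
    pred = x[:3].copy()
    pred[:2] += speed * blend * dt
    pred[2] += x[5] * dt            # vertical motion left unchanged
    return pred
```

The pose-aware step encodes the abstract's key observation: the head yaw acts as a leading indicator of the future walking direction, which a purely kinematic filter cannot exploit.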

Related research

Recovering a 3D head model that includes the complete face and hair regions is still a challenging problem in computer vision and graphics. In this paper, we consider this problem with a few multi-view portrait images as input. Previous multi-view stereo methods, whether based on optimization strategies or deep learning techniques, struggle with low-frequency geometric structure, producing unclear head shapes and inaccurate reconstructions in hair regions. To tackle this problem, we propose a prior-guided implicit neural rendering network. Specifically, we model the head geometry with a learnable signed distance field (SDF) and optimize it via an implicit differentiable renderer under the guidance of several human-head priors, including facial prior knowledge, head semantic segmentation information, and 2D hair orientation maps. Utilizing these priors improves reconstruction accuracy and robustness, leading to a high-quality integrated 3D head model. Extensive ablation studies and comparisons with state-of-the-art methods demonstrate that our method produces high-fidelity 3D head geometries under the guidance of these priors.
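
For readers unfamiliar with learnable SDFs, here is a minimal PyTorch sketch of the core representation: an MLP mapping 3D points to signed distances, plus the standard eikonal regularizer used to keep such fields well formed. The layer sizes are arbitrary assumptions, and the paper's prior-guided losses (facial prior, semantic segmentation, hair orientation) are not reproduced.

```python
import torch
import torch.nn as nn

class HeadSDF(nn.Module):
    """MLP mapping a 3D point to a signed distance; the zero level set is the head surface."""
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, hidden), nn.Softplus(beta=100),
            nn.Linear(hidden, 1),
        )

    def forward(self, xyz):                 # xyz: (N, 3) points
        return self.net(xyz)                # (N, 1) signed distances

def eikonal_loss(sdf, xyz):
    """Standard regularizer keeping |grad f| close to 1, as a valid SDF requires."""
    xyz = xyz.clone().requires_grad_(True)
    d = sdf(xyz)
    (grad,) = torch.autograd.grad(d.sum(), xyz, create_graph=True)
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()
```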
This study presents a novel method to recognize human physical activities using a CNN followed by an LSTM. Achieving high accuracy with traditional machine learning algorithms (such as SVM, KNN, and random forests) is challenging because the data acquired from wearable sensors such as accelerometers and gyroscopes is time-series data. To achieve high accuracy, we propose a multi-head CNN model comprising three CNNs that extract features from the data acquired from the different sensors; the three CNNs are then merged and followed by an LSTM layer and a dense layer. The configuration of all three CNNs is kept the same so that the same number of features is obtained for every input. Using the proposed method, we achieve state-of-the-art accuracy, comparable to traditional machine learning algorithms and other deep neural network algorithms.
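
Going only by the architecture described above (three identically configured CNN heads, one per sensor, merged and followed by an LSTM and a dense layer), a hypothetical PyTorch sketch might look like the following; all layer sizes, channel counts, and names are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class SensorCNN(nn.Module):
    """1D CNN feature extractor; identical config for every sensor head."""
    def __init__(self, in_ch=3, feat=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, feat, kernel_size=5, padding=2), nn.ReLU(),
        )

    def forward(self, x):                   # x: (batch, channels, time)
        return self.net(x)                  # (batch, feat, time)

class MultiHeadCNNLSTM(nn.Module):
    def __init__(self, n_heads=3, in_ch=3, feat=64, hidden=128, n_classes=6):
        super().__init__()
        self.heads = nn.ModuleList([SensorCNN(in_ch, feat) for _ in range(n_heads)])
        self.lstm = nn.LSTM(feat * n_heads, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, xs):                  # xs: one (batch, in_ch, time) tensor per sensor
        feats = torch.cat([h(x) for h, x in zip(self.heads, xs)], dim=1)
        out, _ = self.lstm(feats.transpose(1, 2))   # (batch, time, feat * n_heads)
        return self.fc(out[:, -1])                  # classify from the last time step
```

A usage example under these assumptions: `logits = MultiHeadCNNLSTM()([torch.randn(8, 3, 128) for _ in range(3)])` for three sensors with 3 channels and 128 time steps each.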
Shuqi Liu, Zhaoxia Wu (2019)
The goal of coordinated multi-robot exploration tasks is to employ a team of autonomous robots to explore an unknown environment as quickly as possible. Compared with human-designed methods, which began with heuristic and rule-based approaches, learning-based methods enable individual robots to learn sophisticated and hard-to-design cooperation strategies through deep reinforcement learning. However, in decentralized multi-robot exploration tasks, learning-based algorithms are still far from universally applicable in continuous space because of the difficulties associated with area calculation and reward function design; moreover, existing learning-based methods encounter problems when attempting to balance the historical trajectory issue against the target area conflict problem. Furthermore, these methods scale poorly to large numbers of agents because of the exponential explosion of the state space. Accordingly, this paper proposes Multi-head Attention-based Multi-robot Exploration in Continuous Space (MAMECS), a novel approach aimed at reducing the state space and automatically learning the cooperation strategies required for decentralized multi-robot exploration in continuous space. Computational geometry is applied to describe the environment in continuous space and to design an improved reward function that ensures a superior exploration rate. Moreover, the multi-head attention mechanism helps to solve the historical trajectory issue in the decentralized multi-robot exploration task and to curb the quadratic growth of the action space.
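
As an illustration of the attention component alone (not the exploration algorithm, reward design, or computational geometry), a per-robot encoder can attend over teammates' embedded observations so the policy input stays fixed-size as the team grows. Everything below, including the class name and dimensions, is an assumed sketch:

```python
import torch
import torch.nn as nn

class AgentAttentionEncoder(nn.Module):
    """Per-agent context via multi-head attention over all agents' embeddings."""
    def __init__(self, obs_dim, embed=64, heads=4):
        super().__init__()
        self.proj = nn.Linear(obs_dim, embed)
        self.attn = nn.MultiheadAttention(embed, heads, batch_first=True)

    def forward(self, obs):                 # obs: (batch, n_agents, obs_dim)
        h = self.proj(obs)
        ctx, _ = self.attn(h, h, h)         # each agent attends over the whole team
        return ctx                          # (batch, n_agents, embed) per-agent context
```

Because the attention output per agent has a fixed width regardless of team size, the downstream policy network does not need to grow with the number of robots.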
We present an approach to recover absolute 3D human poses from multi-view images by incorporating multi-view geometric priors into our model. It consists of two separate steps: (1) estimating the 2D poses in the multi-view images and (2) recovering the 3D poses from the multi-view 2D poses. First, we introduce a cross-view fusion scheme into a CNN to jointly estimate 2D poses for multiple views, so that the 2D pose estimate for each view already benefits from the other views. Second, we present a recursive Pictorial Structure Model to recover the 3D pose from the multi-view 2D poses; it gradually improves the accuracy of the 3D pose at affordable computational cost. We test our method on two public datasets, H36M and Total Capture. The Mean Per Joint Position Errors on the two datasets are 26mm and 29mm, which outperforms the state of the art remarkably (26mm vs 52mm, 29mm vs 35mm). Our code is released at https://github.com/microsoft/multiview-human-pose-estimation-pytorch.
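
The second step, lifting multi-view 2D poses to 3D, builds on standard multi-view geometry. A minimal DLT triangulation of a single joint from calibrated views looks like the sketch below; the paper's recursive Pictorial Structure Model is considerably more elaborate, so this is only the textbook baseline:

```python
import numpy as np

def triangulate_joint(points_2d, proj_mats):
    """Linear (DLT) triangulation of one joint from two or more calibrated views.

    points_2d: (V, 2) pixel coordinates of the joint in each view.
    proj_mats: (V, 3, 4) camera projection matrices.
    """
    A = []
    for (u, v), P in zip(points_2d, proj_mats):
        A.append(u * P[2] - P[0])           # two linear constraints per view
        A.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    X = Vt[-1]                              # null-space direction of A
    return X[:3] / X[3]                     # homogeneous -> Euclidean
```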
Herschel's PACS instrument observed the environment of the binary system Mira Ceti in the 70 and 160 micron bands. These images reveal bright structures shaped like five broken arcs, along with fainter filaments, in the material ejected by Mira's primary star. The overall shape of the IR emission around Mira deviates significantly from the alignment expected from Mira's exceptionally high space velocity. The observed broken arcs are neither connected to each other nor circular in shape; they stretch over angular ranges of 80 to 100 degrees. By comparing Herschel and GALEX data, we found evidence for the disruption of the IR arcs by the fast outflow visible in both Hα and the far UV. Radial intensity profiles place the arcs at distances of 6-85 arcsec (550-8000 AU) from the binary. Mira's IR environment appears to be shaped by the complex interaction of Mira's wind with its companion, the bipolar jet, and the ISM.
