In virtual, augmented, and mixed reality, hand gestures are increasingly used to narrow the gap between the virtual and real worlds. Precise fingertip localization is essential for a seamless experience. Much of the existing research relies on depth information to estimate fingertip positions, whereas most work that detects fingertips from RGB images is limited to a single finger. Detecting multiple fingertips from a single RGB image is very challenging due to various factors. In this paper, we propose a deep neural network (DNN) based methodology to estimate fingertip positions. We call this methodology Anchor based Fingertips Position Estimation (ABFPE); it is a two-step process. Fingertip locations are estimated by regression on the offset of each fingertip from its nearest anchor point. The proposed framework performs best, with only limited dependence on hand detection results. In our experiments on the SCUT-Ego-Gesture dataset, we achieved a fingertip detection error of 2.3552 pixels on video frames with a resolution of $640 \times 480$, and about $92.98\%$ of test images have an average pixel error of five pixels.
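The anchor-offset idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how anchors are placed or how the network is built, so the evenly spaced grid, the stride of 32 pixels, and the helper names `make_anchor_grid`, `encode`, and `decode` are all assumptions made for illustration. The sketch shows only the target encoding (fingertip to nearest anchor plus offset) and its inverse, not the DNN itself.

```python
import math

def make_anchor_grid(width, height, stride):
    """Anchor points at the centres of stride x stride cells (assumed layout)."""
    return [(x + stride / 2, y + stride / 2)
            for y in range(0, height, stride)
            for x in range(0, width, stride)]

def encode(fingertip, anchors):
    """Return the index of the nearest anchor and the offset from it to the fingertip."""
    idx = min(range(len(anchors)),
              key=lambda i: math.dist(fingertip, anchors[i]))
    ax, ay = anchors[idx]
    return idx, (fingertip[0] - ax, fingertip[1] - ay)

def decode(idx, offset, anchors):
    """Recover the fingertip position from an anchor index and a regressed offset."""
    ax, ay = anchors[idx]
    return (ax + offset[0], ay + offset[1])

# Hypothetical example on a 640 x 480 frame, the resolution quoted in the abstract.
anchors = make_anchor_grid(640, 480, 32)
idx, offset = encode((101.0, 250.0), anchors)
recovered = decode(idx, offset, anchors)
print(anchors[idx], offset, recovered)
```

In a scheme of this kind, a regressor would be trained to predict the offset rather than the absolute coordinates, which keeps the regression targets small and bounded by the anchor spacing.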
In this paper, we present a novel approach that uses deep learning techniques to colorize grayscale images. By utilizing a pre-trained convolutional neural network originally designed for image classification, we are able to separate con
Deep neural networks have been applied to a wide range of problems in recent years. In this work, a Convolutional Neural Network (CNN) is applied to the problem of determining depth from a single camera image (monocular depth). Eight different networks a
We present animatable neural radiance fields (animatable NeRF) for detailed human avatar creation from monocular videos. Our approach extends neural radiance fields (NeRF) to dynamic scenes with human movements by introducing explicit pose-guide
This paper presents a neural network to estimate a detailed depth map of the foreground human in a single RGB image. The result captures geometry details such as cloth wrinkles, which are important in visualization applications. To achieve this goal,
We propose a new single-shot method for multi-person 3D pose estimation in general scenes from a monocular RGB camera. Our approach uses novel occlusion-robust pose-maps (ORPM) which enable full body pose inference even under strong partial occlusion