Coherency in One-Shot Gesture Recognition


Abstract

Users' intentions may be expressed through spontaneous gestures that have been observed only a few times, or never, before. Recognizing such gestures is a problem of one-shot gesture learning. While most research has focused on recognizing the gestures themselves, new approaches have recently been proposed that treat gesture perception and production as parts of the same problem. The framework presented in this work focuses on learning the process that leads to gesture generation, rather than mining the features associated with the gestures. This is achieved using kinematic, cognitive, and biomechanical characteristics of human interaction. These factors enable the artificial production of realistic gesture samples from a single observation. The generated samples are then used as training sets for several state-of-the-art classifiers. Performance is assessed first by measuring the machines' gesture recognition rates, and then by measuring how well human participants recognize gestures performed by robots. Based on these two scenarios, a new composite metric, coherency, is proposed, quantifying the agreement between these two conditions. Experimental results show an average recognition performance of 89.2% for the trained classifiers and 92.5% for the participants. Coherency in recognition was determined at 93.6%. While this new metric is not directly comparable to raw accuracy or other purely performance-based standard metrics, it quantifies how realistic the machine-generated samples are and how accurate the resulting mimicry is.
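The abstract does not spell out how coherency is computed; the sketch below is a minimal illustration assuming one plausible reading, namely per-trial agreement between machine and human recognition outcomes (both correct, or both incorrect). The function name coherency, the agreement-based definition, and the toy data are assumptions for illustration, not the paper's stated formula.

import numpy as np

def coherency(machine_correct, human_correct):
    # Assumed definition: fraction of trials on which the classifier and
    # the human participant agree (both recognize the gesture, or both fail to).
    machine_correct = np.asarray(machine_correct, dtype=bool)
    human_correct = np.asarray(human_correct, dtype=bool)
    return float(np.mean(machine_correct == human_correct))

# Toy scores for 10 gesture trials (1 = recognized correctly).
machine = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
human   = [1, 1, 0, 1, 0, 1, 1, 1, 1, 1]
print(f"Coherency: {coherency(machine, human):.1%}")  # -> 80.0%

Under this reading, a coherency of 93.6% would mean the classifiers and the participants agreed on 93.6% of trials, regardless of whether the shared outcome was a hit or a miss.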