Chapter 11
Motion Database Retrieval with Application
to Gesture Recognition in a Virtual Reality
Dance Training System
Abstract This chapter presents gesture recognition methods and their application
to a dance training system in an instructional, virtual reality (VR) setting. The pro-
posed system is based on the unsupervised parsing of dance movement into a
structured posture space using the spherical self-organizing map (SSOM). A unique
feature descriptor is obtained from the gesture trajectories through posture space on
the SSOM. For recognition, various methods are explored for trajectory analysis,
which include sparse coding, posture occurrence, posture transition, and the hidden
Markov model. Within the system, the dance sequence of a student can be
segmented online and cross-referenced against a library of gestural components
performed by the teacher. This facilitates assessment of the student's dance and
provides visual feedback for effective training.
11.1 Introduction
Recent trends toward more immersive and interactive computing have brought
increasing demand for accurate tools to understand and interpret human gestural
input, a task known as gesture recognition. Gestures are expressive, meaningful
body motions involving physical movements of the fingers, hands, arms, head, face,
or body with the intent of (1) conveying meaningful information or (2) interacting
with the environment.
In a virtual reality dance training system, the central recognition problem is the
comparison of motion data captured in real time from the trainee against the
reference (trainer) data. Applied to dance training, human action recognition
algorithms have been used in the automated assessment of dance performance [334],
the visual comparison of virtual characters [335], and the synthesis of dance
partners [336]. This chapter presents a method to address these problems based on
two techniques: the spherical self-organizing map (SSOM) and transition analysis
of the trajectory on the map.
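The core idea can be sketched in a few lines: each posture frame is quantized to its best-matching node on a trained map, turning a motion sequence into a trajectory of node indices, and node-to-node transition counts along that trajectory form a simple signature for comparison. The sketch below is illustrative only; it substitutes random prototype vectors for a trained SSOM, and the feature dimensionality and map size are assumptions, not values from the chapter.

```python
import numpy as np

# Hypothetical stand-in for a trained SSOM: in the chapter, these node
# weights would come from unsupervised training on dance motion data.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(16, 6))  # 16 map nodes, 6-D posture features

def to_trajectory(frames, prototypes):
    """Map each posture frame to its best-matching unit (BMU),
    turning a motion sequence into a trajectory of node indices."""
    # Squared Euclidean distance of every frame to every prototype.
    d = ((frames[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def transition_matrix(trajectory, n_nodes):
    """Count node-to-node transitions along the trajectory; the resulting
    matrix is one simple descriptor for comparing gestures."""
    T = np.zeros((n_nodes, n_nodes))
    for a, b in zip(trajectory[:-1], trajectory[1:]):
        T[a, b] += 1
    return T

frames = rng.normal(size=(50, 6))      # a synthetic 50-frame gesture
traj = to_trajectory(frames, prototypes)
T = transition_matrix(traj, len(prototypes))
```

A student's gesture and a teacher's reference gesture would each yield such a transition matrix, and a distance between the matrices gives a crude similarity score; the chapter's later sections refine this with sparse coding, occurrence statistics, and hidden Markov models.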
Human gestures are temporal data; context relates to the states that have led to
(or follow) the state at the present time step. Thus, the collection of states and
their layout can be indicative of meaning for gesture recognition [339, 340].
Recent studies have attempted to analyze how such temporal data maps onto
self-organizing maps (SOMs) [338, 341, 352]. The idea behind applying SOMs to the
problem of gesture recognition is to deal with the challenge of how to effectively