on the characteristics of the application running in the system. For this purpose,
we focus on a specific algorithm, our proposed 3D human detection/activity
recognition system, and evaluate several extended aspects that are presented in
this section.
Algorithmic issues
In the section “3D Human Detection and Activity Recognition Techniques,” we
presented previous work on the basic steps of stereo vision algorithms and their
real-time applicability for different applications. In general, we can divide 3D
human detection and activity recognition methods into two categories (Cheung
et al., 2000): off-line methods, where the algorithms focus on detailed model
reconstruction (e.g., wire-frame generation), and real-time methods with global
3D human model reconstruction (Bregler & Malik, 1998; Delamarre & Faugeras,
2001).
The major challenge in many 3D applications is to compute dense range data at
high frame rates, since participants cannot easily communicate if the processing
cycle or network latencies are long. As an example of a non-real-time method,
consider the work of Mulligan et al. (2001). To achieve the required speed and
accuracy, they use a matching algorithm based on the sum of modified normalized
cross-correlations, combined with sub-pixel disparity interpolation. To increase
speed, they rely on Intel IPL functions for the pre-processing steps of background
subtraction and image rectification, as well as four-processor parallelization.
Even so, they achieve a speed of only 2-3 frames per second. Another
non-real-time method (Kakadiaris & Metaxas, 1995) was presented in the previous
section.
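To make the matching step concrete, the sketch below shows window-based stereo matching with normalized cross-correlation and a parabolic sub-pixel refinement of the winning disparity. It only illustrates the general technique named above; the window size, disparity range, and parabolic interpolation scheme are our own assumptions, not details of Mulligan et al.'s implementation.

```python
# Sketch of window-based stereo matching with normalized cross-correlation
# (NCC) and parabolic sub-pixel disparity refinement. Illustrative only:
# window size, disparity range, and the refinement scheme are assumptions.

import numpy as np


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized patches."""
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum()) + 1e-12
    return float((a * b).sum() / denom)


def disparity_at(left, right, row, col, max_disp=32, half_win=3):
    """Sub-pixel disparity estimate for one pixel of the left image.

    Assumes rectified images, so corresponding points lie on the same row.
    """
    ref = left[row - half_win:row + half_win + 1,
               col - half_win:col + half_win + 1]
    scores = np.full(max_disp + 1, -np.inf)
    for d in range(max_disp + 1):
        c = col - d
        if c - half_win < 0:
            break
        cand = right[row - half_win:row + half_win + 1,
                     c - half_win:c + half_win + 1]
        scores[d] = ncc(ref, cand)

    d_best = int(np.argmax(scores))
    # Parabolic interpolation over the three scores around the peak gives a
    # sub-pixel offset (one common refinement choice).
    if 0 < d_best < max_disp and np.isfinite(scores[d_best + 1]):
        s_m, s_0, s_p = scores[d_best - 1], scores[d_best], scores[d_best + 1]
        denom = s_m - 2.0 * s_0 + s_p
        offset = 0.5 * (s_m - s_p) / denom if abs(denom) > 1e-12 else 0.0
        return d_best + offset
    return float(d_best)


if __name__ == "__main__":
    # Synthetic check: the right image is the left image shifted by 5 pixels.
    rng = np.random.default_rng(0)
    left = rng.random((64, 64)).astype(np.float32)
    right = np.roll(left, -5, axis=1)
    print(disparity_at(left, right, row=32, col=40))  # close to 5.0
```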
Most of the real-time methods use a generic 3D human model and fit its
projection to silhouette features extracted from the images. Other
silhouette-based methods are proposed by Cheung et al. (2000) and, more
recently, by Luck et al. (2002), in which the human model is fitted in real time
directly in the 3D domain. The first method reaches a speed of 15 frames per
second, whereas the second runs at 20 frames per second; the speed of both
systems depends strongly on the voxel resolution. None of these methods uses the
2D information obtained from each camera, combining high-level information
(e.g., head, torso, and hand locations and their activities) with low-level
information (e.g., ellipse parameters), to generate a global 3D model of the
human body parts and recognize their activities in 3D. 2D information, in the
form of human image position and body-part labeling, is a very valuable input
for higher-level modules; in our system, it forms the basis for constructing the
3D body and activity model.
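As a rough illustration of how labeled 2D body-part positions from calibrated cameras can be lifted into a common 3D model, the sketch below triangulates each part from two views with a linear (DLT) solver. The part names, projection matrices, and two-camera setup are illustrative assumptions and do not reproduce our system's actual fitting procedure.

```python
# Sketch: combine labeled 2D body-part positions from two calibrated cameras
# into 3D locations via linear (DLT) triangulation. Hypothetical setup; the
# part names and camera matrices are assumptions for illustration.

import numpy as np


def triangulate(P1, P2, x1, x2):
    """Triangulate one 3D point from its 2D projections in two views.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: (u, v) image coordinates of the same body part in each view.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # Solve A X = 0 in the least-squares sense via SVD; X is homogeneous.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def build_3d_body_model(P1, P2, parts_cam1, parts_cam2):
    """Combine per-camera 2D part locations into a 3D body-part model."""
    return {name: triangulate(P1, P2, parts_cam1[name], parts_cam2[name])
            for name in parts_cam1 if name in parts_cam2}


if __name__ == "__main__":
    # Two synthetic cameras: an identity view and a view translated along X.
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
    # Hypothetical 2D detection of one body part ("head") in each camera.
    point = np.array([0.2, 0.1, 4.0])                      # true 3D position
    x1 = point[:2] / point[2]                              # projection in camera 1
    x2 = (point[:2] + np.array([-1.0, 0.0])) / point[2]    # projection in camera 2
    print(build_3d_body_model(P1, P2, {"head": x1}, {"head": x2}))
    # prints approximately [0.2, 0.1, 4.0]
```

With more than two cameras, the same linear system simply gains two rows per additional view, so the combination step scales naturally with the camera setup.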