Human Motion Recognition
Human motion recognition may also be achieved by analyzing extracted 3D
pose parameters. However, because recovering such parameters requires extra
pre-processing, human motion patterns are usually recognized from low-level
features (e.g., silhouettes) obtained during tracking.
Ali & Aggarwal (2001) separate continuous human activity (e.g., walking,
sitting down, bending) into individual actions using a single camera. To
detect the commencement and termination of actions, the human skeleton is
extracted and the angles subtended by the torso, the upper leg, and the lower
leg are estimated. Each action is then recognized from the characteristic
path that these angles traverse. This technique, however, relies on lateral
views of the human body.
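As an illustration of this kind of angle-based representation, the per-frame angles could be computed as follows, assuming 2D joint positions are already available from the extracted skeleton (the function names and joint layout are hypothetical, not taken from the paper):

```python
import math

def segment_angle(p, q):
    """Inclination (degrees) of the segment from joint p to joint q,
    measured from the vertical image axis."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return math.degrees(math.atan2(dx, dy))

def pose_angles(neck, hip, knee, ankle):
    """Torso, upper-leg and lower-leg angles for one frame.
    Tracking these triples over time yields the characteristic
    path from which each action is recognized."""
    return (segment_angle(neck, hip),    # torso
            segment_angle(hip, knee),    # upper leg
            segment_angle(knee, ankle))  # lower leg
```

An upright, straight-legged pose yields angles near zero; a sitting-down sequence would trace a distinctive curve in this three-angle space.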
Park & Aggarwal (2000) propose a method for separating and classifying not
one person's actions, but two humans' interactions (shaking hands, pointing at
the opposite person, standing hand-in-hand) in indoor monocular grayscale
images with limited occlusions. The aim is to interpret interactions by inferring
the intentions of the persons. Recognition is independently achieved in each
frame by applying the K-nearest-neighbor classifier to a feature vector, which
describes the interpersonal configuration. Sato & Aggarwal (2001) also
address human interaction recognition. Their technique uses outdoor
monocular grayscale images and can cope with low-quality input, but is
limited to movements perpendicular to the camera. It can classify nine
two-person interactions (e.g., one person leaves another stationary person,
two people meet from different directions). Four features are extracted
from the trajectory of each person: the absolute velocity of each person,
their average size, the relative distance, and its derivative.
Identification is based on the features' similarity to an interaction
model, using the nearest-mean method.
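The four trajectory features and the nearest-mean matching might be sketched as follows (the function names and the toy interaction models are assumptions for illustration; in the paper the models are derived from observed interactions):

```python
import numpy as np

def interaction_features(traj_a, traj_b, size_a, size_b):
    """Per-frame feature vectors from two tracked trajectories:
    |v_a|, |v_b|, average size, relative distance, and its derivative."""
    traj_a = np.asarray(traj_a, float)
    traj_b = np.asarray(traj_b, float)
    va = np.linalg.norm(np.diff(traj_a, axis=0), axis=1)   # speed of person A
    vb = np.linalg.norm(np.diff(traj_b, axis=0), axis=1)   # speed of person B
    dist = np.linalg.norm(traj_a - traj_b, axis=1)         # relative distance
    ddist = np.diff(dist)                                  # distance derivative
    size = np.full_like(va, 0.5 * (size_a + size_b))       # average size
    return np.column_stack([va, vb, size, dist[1:], ddist])

def nearest_mean(feature_seq, model_means):
    """Return the interaction label whose model mean vector is closest
    to the mean feature vector of the observed sequence."""
    mean_vec = feature_seq.mean(axis=0)
    return min(model_means,
               key=lambda k: np.linalg.norm(mean_vec - model_means[k]))
```

For instance, two stationary people standing far apart produce near-zero velocities and a large, constant relative distance, which matches a "staying apart" model rather than a "meeting" one.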
Action and interaction recognition, such as standing, walking, meeting people and
carrying objects, is addressed by Haritaoglu, Harwood & Davis (1998, 2000). A
real-time tracking system, which is based on outdoor monocular grayscale
images taken from a stationary visible or infrared camera, is introduced.
Grayscale textural appearance and shape information of a person are combined
into a textural temporal template, an extension of the temporal templates
defined by Bobick & Davis (1996).
Bobick & Davis (1996) introduced a real-time human activity recognition
method, which is based on a two-component image representation of motion. The
first component (Motion Energy Image, MEI) is a binary image, which displays
where motion has occurred during the movement of the person. The second one
(Motion History Image, MHI) is a scalar image, which indicates the temporal
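A minimal sketch of the two templates, assuming the input is a sequence of binary motion masks (e.g., thresholded frame differences) and using a simple linear decay over a window of tau frames (the function name is an assumption):

```python
import numpy as np

def mei_mhi(motion_masks, tau=None):
    """Compute a Motion Energy Image (binary union of where motion occurred)
    and a Motion History Image (brighter where motion happened more recently)
    from a sequence of binary motion masks."""
    masks = np.asarray(motion_masks, bool)
    tau = len(masks) if tau is None else tau   # temporal extent of the template
    mhi = np.zeros(masks.shape[1:], float)
    for frame in masks:
        # Current motion is set to tau; older motion decays by 1 per frame.
        mhi = np.where(frame, tau, np.maximum(mhi - 1.0, 0.0))
    mei = mhi > 0                              # MEI: any motion within the window
    return mei, mhi
```

The MEI answers "where did motion occur?", while the MHI additionally encodes "how recently?", so the pair forms a compact per-sequence representation for matching.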