The CONDENSATION algorithm used in tracking, together with a comparison against Kalman filters, can be found in Isard & Blake (1998).
In Wachter & Nagel (1999), a 3D model composed of right-elliptical cones is
fitted to consecutive frames by means of an iterated extended Kalman filter. A
motion model of constant velocity for all DOFs is used for prediction, while the
update of the parameters is based on a maximum a-posteriori estimation
incorporating edge and region information. This approach is able to cope with
self-occlusions occurring between the legs of a walking person.
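To make the prediction step concrete, the sketch below implements a constant-velocity Kalman prediction over a generic joint-angle state. The state layout, time step and process-noise level are illustrative assumptions and are not taken from Wachter & Nagel (1999), whose filter also includes an iterated, linearized measurement update that is omitted here.

```python
import numpy as np

def predict_constant_velocity(x, P, dt=1.0, q=1e-2):
    """Kalman prediction assuming constant velocity for every DOF.

    x  : state vector [angles, angular velocities], length 2*n
    P  : state covariance, shape (2n, 2n)
    dt : time step between frames (assumed)
    q  : process-noise magnitude (assumed)
    """
    n = x.size // 2
    # Transition: angle_{k+1} = angle_k + dt * velocity_k, velocity constant.
    F = np.eye(2 * n)
    F[:n, n:] = dt * np.eye(n)
    Q = q * np.eye(2 * n)           # simplistic isotropic process noise
    x_pred = F @ x                  # predicted state
    P_pred = F @ P @ F.T + Q        # predicted covariance
    return x_pred, P_pred

# Example: three DOFs (e.g., hip, knee, ankle angles) plus their velocities.
x0, P0 = np.zeros(6), np.eye(6)
x1, P1 = predict_constant_velocity(x0, P0)
```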
Self-occlusions are also tackled in a Bayesian tracking system presented in Howe, Leventon & Freeman (1999). This system tracks human figures in short monocular sequences and reconstructs their motion in 3D, using prior information learned from training data. The training data consist of vectors, each gathered over 11 successive frames and containing the 3D coordinates of 20 tracked body points; these vectors are used to build a mixture-of-Gaussians probability density model. 3D reconstruction is achieved by establishing correspondence between the training data and the extracted features.
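A prior of this form can be sketched as follows: the snippet fits a mixture of Gaussians to vectors that stack 11 successive frames of 20 body points in 3D, then scores a candidate reconstruction under the learned density. The placeholder data, the use of scikit-learn's GaussianMixture, and the component count are assumptions for illustration, not the estimator described in Howe, Leventon & Freeman (1999).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Each training vector stacks 11 successive frames of 20 body points in 3D.
FRAMES, POINTS, DIMS = 11, 20, 3
VEC_LEN = FRAMES * POINTS * DIMS          # 660-dimensional motion snippets

# Placeholder training data; in practice these come from motion capture.
rng = np.random.default_rng(0)
train = rng.normal(size=(500, VEC_LEN))

# Mixture-of-Gaussians density over the motion snippets (assumed settings).
prior = GaussianMixture(n_components=10, covariance_type="diag").fit(train)

# Score a hypothesized 3D reconstruction: higher log-likelihood means the
# candidate snippet is more consistent with the learned motion prior.
candidate = rng.normal(size=(1, VEC_LEN))
log_prior = prior.score_samples(candidate)[0]
```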
Sidenbladh, Black & Sigal (2002) also use a probabilistic approach to address the problem of modeling 3D human motion for synthesis and tracking. They avoid the high dimensionality and non-linearity of body movement modeling by representing the posterior distribution non-parametrically. Learning state transition probabilities is replaced with an efficient probabilistic search in a large training set. An approximate probabilistic tree-search method takes advantage of the coefficients of a low-dimensional model and returns a particular sample human motion.
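The idea of replacing learned transition probabilities with a search through training data can be pictured with a low-dimensional index over stored motions. Below, a PCA projection and a kd-tree (SciPy's cKDTree) stand in for the approximate probabilistic tree search; the dimensions, placeholder data and deterministic nearest-neighbour query are assumptions for illustration, not the method of Sidenbladh, Black & Sigal (2002).

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)

# Training set: many short motion snippets flattened to vectors (placeholder).
train = rng.normal(size=(2000, 660))

# Low-dimensional model: project snippets onto the top principal components.
mean = train.mean(axis=0)
centered = train - mean
_, _, vt = np.linalg.svd(centered, full_matrices=False)
basis = vt[:20]                          # assumed 20-dimensional coefficient space
coeffs = centered @ basis.T

# Index the coefficients so the tracker can quickly retrieve training motions
# close to its current estimate instead of sampling a learned dynamics model.
tree = cKDTree(coeffs)

query = (rng.normal(size=660) - mean) @ basis.T
dists, idx = tree.query(query, k=5)      # nearest stored motions
samples = train[idx]                     # returned example human motions
```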
In contrast to single-view approaches, multiple camera techniques are able to
overcome occlusions and depth ambiguities of the body parts, since useful motion
information missing from one view may be recovered from another view.
A rich set of features is used in Okada, Shirai & Miura (2000) for the estimation
of the 3D translation and rotation of the human body. Foreground regions are
extracted by combining optical flow, depth (which is calculated from a pair of
stereo images) and prediction information. 3D pose estimation is then based on
the position and shape of the extracted region and on past states using Kalman
filtering. The evident problem of pose singularities is tackled probabilistically.
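One way to picture the fusion of these cues is as a vote over binary masks; the thresholds and the two-out-of-three voting rule below are illustrative assumptions, not the segmentation procedure of Okada, Shirai & Miura (2000).

```python
import numpy as np

def foreground_mask(flow_mag, disparity, predicted_mask,
                    flow_thresh=1.0, near_disp=8.0):
    """Combine motion, depth and prediction cues into a foreground mask.

    flow_mag       : per-pixel optical-flow magnitude
    disparity      : per-pixel stereo disparity (larger = closer)
    predicted_mask : boolean mask projected from the previous pose estimate
    Threshold values are assumptions chosen for illustration.
    """
    moving = flow_mag > flow_thresh            # pixels that moved
    close = disparity > near_disp              # pixels near the cameras
    # Keep pixels supported by at least two of the three cues.
    votes = moving.astype(int) + close.astype(int) + predicted_mask.astype(int)
    return votes >= 2

# Example on synthetic 120x160 inputs.
h, w = 120, 160
rng = np.random.default_rng(2)
mask = foreground_mask(rng.random((h, w)) * 3.0,
                       rng.random((h, w)) * 16.0,
                       np.zeros((h, w), dtype=bool))
```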
A framework for person tracking in various indoor scenes is presented in Cai &
Aggarwal (1999), using three synchronized cameras. Though there are three
cameras, tracking is actually based on one camera view at a time. When the
system predicts that the active camera no longer provides a sufficient view of
the person, it is deactivated and the camera providing the best view is selected.
Feature correspondence between consecutive frames is achieved using Bayesian classification schemes associated with motion analysis in a spatio-temporal
domain. However, this method cannot deal with occlusions above a certain level.
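A minimal version of this switching rule could project the predicted 3D position of the person into every camera and keep the view in which the subject lies farthest from the image border. The projection model and the margin criterion below are assumptions for illustration, not the view-selection rule of Cai & Aggarwal (1999).

```python
import numpy as np

def select_camera(pred_xyz, projections, image_size=(640, 480)):
    """Pick the camera giving the widest margin around a predicted 3D point.

    pred_xyz    : predicted 3D position of the tracked person
    projections : list of 3x4 camera projection matrices
    """
    w, h = image_size
    best_idx, best_margin = -1, -np.inf
    for i, P in enumerate(projections):
        u, v, s = P @ np.append(pred_xyz, 1.0)
        if s <= 0:                              # person is behind this camera
            continue
        x, y = u / s, v / s
        margin = min(x, w - x, y, h - y)        # distance to the nearest edge
        if margin > best_margin:
            best_idx, best_margin = i, margin
    return best_idx

# Example: two toy cameras sharing the same intrinsics, offset along x.
K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
P0 = K @ np.hstack([np.eye(3), [[0.], [0.], [0.]]])
P1 = K @ np.hstack([np.eye(3), [[-2.], [0.], [0.]]])
best = select_camera(np.array([0.5, 0.2, 4.0]), [P0, P1])   # -> 0
```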