Gait Analysis and Human Motion Tracking - Intelligent Video Event Analysis and Understanding

search window, or allows a longer gap between frames, each resulting in greater efficiency. Further, there is less chance of an incorrect match, as the probability of similarity between detected features is reduced. The nature of the gait model described by Eq. (7) calls for a nonlinear optimization process. This is a dependent multivariate process, and the stability of the solution is not guaranteed, since too many unknown parameters reside in the optimization [15].

We register the projection of a 3-D “template”, yielded by the gait model, with the 2-D feature points in the latest frame. This is an incomplete data problem because the correspondences are not known accurately a priori: some correspondences are incorrect and the feature positions contain errors. We expect a number of outliers among the corresponding matches. Hence, we derive a global rather than a local similarity between the actual correspondences and the image projections of the scene structure.
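To make the registration step concrete, the following minimal sketch projects 3-D template points into the image under a candidate camera pose. It is an illustration only, not the authors' method: the pinhole camera, the restriction to a rotation about the vertical axis (`yaw`), and the names `project_template` and `focal_length` are all assumptions introduced here.

```python
import math


def project_template(points_3d, yaw, translation, focal_length=1.0):
    """Project 3-D points into the image under a hypothetical camera pose.

    Assumptions (not from the source): a pinhole camera, a rotation about
    the vertical (y) axis only, and a 3-D translation (tx, ty, tz).
    """
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    projected = []
    for x, y, z in points_3d:
        # Rotate about the y axis, then translate into the camera frame.
        xc = c * x + s * z + tx
        yc = y + ty
        zc = -s * x + c * z + tz
        if zc <= 0.0:
            raise ValueError("point lies behind the camera")
        # Perspective division gives the 2-D image coordinates.
        projected.append((focal_length * xc / zc, focal_length * yc / zc))
    return projected


if __name__ == "__main__":
    template = [(1.0, 2.0, 4.0), (-1.0, 0.5, 6.0)]
    print(project_template(template, yaw=0.05, translation=(0.0, 0.0, 1.0)))
```

The projected points are what would be compared with the detected 2-D features; a pose predicted by the gait model would simply be passed in as `yaw` and `translation`.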

In the presence of incorrect feature correspondences, robust registration and recovery of the camera transformation between frames is explored using a maximum a posteriori (MAP) strategy, instantiated by the expectation-maximization (EM) algorithm [11]. Hence, we iterate the expectation and maximization steps until convergence is reached at the global minimum. The expectation step computes the a posteriori probabilities of the incomplete data using Gaussian mixture models, given the image observations and the belief in motion provided by the gait model. The maximization step involves a maximum a posteriori estimate that refines the predicted motion parameters in order to obtain a minimum sum of Euclidean distances between image points. Our approach is inspired by that used by Cross and Hancock [7], Choi et al. [5] and Zhou et al. [41]. However, the key difference is the use of the gait model to predict frame-to-frame camera transformations, thus improving the efficiency of the approach. In comparison with [41], we have extended our evaluation to include synthetic and real pedestrian sequences in which walking velocity changes. Furthermore, we have added an experiment investigating the effect of moving obstacles (other pedestrians) in real image sequences.
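The alternation of expectation and maximization steps can be sketched on a deliberately simplified problem. The code below is an assumption-laden illustration, not the authors' implementation: the camera transformation is reduced to a 2-D image translation, the gait-model belief is modelled as a Gaussian prior centred on a hypothetical prediction `t_pred`, outliers are absorbed by a constant `outlier_weight`, and the M-step minimizes weighted squared distances so that it has a closed form.

```python
import math


def em_map_translation(template, observed, t_pred, sigma=1.0, tau=5.0,
                       outlier_weight=1e-3, iters=50, tol=1e-9):
    """Estimate a 2-D translation by EM with a MAP M-step (illustrative).

    template : projected template points (x, y)
    observed : detected feature points (x, y), possibly with outliers
    t_pred   : predicted translation, used as start value and prior mean
    sigma    : assumed std. dev. of feature position noise
    tau      : assumed std. dev. of the prior around t_pred
    """
    tx, ty = t_pred
    inv_s2 = 1.0 / sigma ** 2
    inv_t2 = 1.0 / tau ** 2
    for _ in range(iters):
        # The prior contributes one pseudo-observation at t_pred.
        num_x, num_y, den = t_pred[0] * inv_t2, t_pred[1] * inv_t2, inv_t2
        for bx, by in observed:
            diffs = [(bx - ax, by - ay) for ax, ay in template]
            # E-step: Gaussian responsibility of each template point.
            g = [math.exp(-0.5 * inv_s2 * ((dx - tx) ** 2 + (dy - ty) ** 2))
                 for dx, dy in diffs]
            norm = sum(g) + outlier_weight  # outlier term soaks up bad matches
            for (dx, dy), gij in zip(diffs, g):
                r = gij / norm
                num_x += r * inv_s2 * dx
                num_y += r * inv_s2 * dy
                den += r * inv_s2
        # M-step: closed-form MAP update of the translation.
        new_tx, new_ty = num_x / den, num_y / den
        shift = math.hypot(new_tx - tx, new_ty - ty)
        tx, ty = new_tx, new_ty
        if shift < tol:
            break
    return tx, ty


if __name__ == "__main__":
    template = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
    observed = [(x + 2.0, y - 1.0) for x, y in template] + [(50.0, 50.0)]
    print(em_map_translation(template, observed, t_pred=(1.5, -0.5)))
```

In this toy setting the stray point at (50, 50) receives a responsibility of essentially zero, so it does not pull the estimate, while the prior keeps the result close to the prediction when few reliable matches exist. Note that EM in general is only guaranteed to reach a stationary point, so a good starting prediction matters.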

4.1 Estimating Motion Parameters by a MAP Strategy

Consider a dynamic representation for the registration, $f(\tilde{\alpha}_t, \tilde{\beta}_t, \tilde{\phi}_t)$, where $\tilde{\alpha}_t$ refers to the 3-D points recovered from corresponding image points, $\tilde{\beta}_t$ is the image observation, and $\tilde{\phi}_t$ is the current prediction of the camera transformation, based on the longer term gait model, at time $t$. Given a good $\tilde{\phi}_t$, the posterior probability (or the likelihood of the hypothesis, $\tilde{\phi}_t$), $p(\tilde{\beta}_t \mid \tilde{\phi}_t)$, is maximized to find an optimal $\tilde{\beta}_t$.

Assuming individual image points are conditionally independent [5], the joint probability is therefore

$$p(\tilde{\beta}_t \mid \tilde{\phi}_t) = \prod_i p(\tilde{\beta}_t^i \mid \tilde{\phi}_t), \tag{8}$$

where $i$ is the index of an image point. Using Bayes' rule,

$$p(\tilde{\beta}_t^i \mid \tilde{\phi}_t) = \sum_j p(\tilde{\beta}_t^i \mid \tilde{\alpha}_t^j, \tilde{\phi}_t)\, p(\tilde{\alpha}_t^j), \tag{9}$$
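Equations (8) and (9) can be evaluated numerically as a sum of log mixture terms. The sketch below is a hedged illustration: it assumes an isotropic Gaussian for $p(\tilde{\beta}_t^i \mid \tilde{\alpha}_t^j, \tilde{\phi}_t)$ centred on the image projection of each 3-D point, and a uniform prior $p(\tilde{\alpha}_t^j) = 1/J$. Neither choice, nor the function name `mixture_log_likelihood`, is stated in the text shown here.

```python
import math


def mixture_log_likelihood(observed, projected, sigma=1.0):
    """Log of Eq. (8), with each factor expanded as in Eq. (9).

    observed  : image points beta_i as (x, y)
    projected : image projections of the 3-D points alpha_j under the
                current transformation (projection model assumed elsewhere)
    sigma     : assumed isotropic std. dev. of the Gaussian components
    """
    log_prior = -math.log(len(projected))             # uniform p(alpha_j)
    log_norm = -math.log(2.0 * math.pi * sigma ** 2)  # 2-D Gaussian constant
    total = 0.0
    for bx, by in observed:
        # Log of each summand in Eq. (9).
        terms = [log_prior + log_norm
                 - ((bx - px) ** 2 + (by - py) ** 2) / (2.0 * sigma ** 2)
                 for px, py in projected]
        # Log-sum-exp keeps the sum over j numerically stable.
        m = max(terms)
        total += m + math.log(sum(math.exp(v - m) for v in terms))
    return total  # the product over i in Eq. (8) becomes a sum of logs


if __name__ == "__main__":
    obs = [(0.0, 0.0), (5.0, 5.0)]
    print(mixture_log_likelihood(obs, [(0.0, 0.0), (5.0, 5.0)]))
    print(mixture_log_likelihood(obs, [(1.0, 1.0), (6.0, 6.0)]))
```

Working in the log domain avoids underflow when many image points are multiplied together, and a better aligned set of projections yields a higher value.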
