P(l) in Eq. (6) is the prior associated with each pyramid level and is used to combine the conditional likelihoods of the levels; we denote it as β_l. Motivated by the theory of spatial pyramid matching, we determine the weight β_l from the formulation of a maximum-weight problem [8, 14]: it is inversely proportional to the cell width at that level, as shown in Eq. (8). Intuitively, we want to penalize likelihoods found in larger cells because they involve increasingly dissimilar features. Taking all of this into consideration, we calculate the pyramid likelihood as follows:
p(Y) = \sum_{l=0}^{L} \beta_l \, p(Y \mid l) = \frac{1}{2^{L}} \, p(Y \mid l=0) + \sum_{l=1}^{L} \frac{1}{2^{L-l+1}} \, p(Y \mid l) \qquad (8)
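The weighting in Eq. (8) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the helper names `pyramid_weights` and `pyramid_likelihood` are hypothetical, and the per-level likelihoods p(Y | l) are assumed to be given as plain numbers.

```python
# Sketch of the pyramid-likelihood combination in Eq. (8).
# beta_0 = 1/2^L, and beta_l = 1/2^(L-l+1) for l = 1..L, so finer
# levels (larger l, smaller cells) receive larger weights.

def pyramid_weights(L):
    """Spatial-pyramid weights for levels 0..L; they sum to 1."""
    betas = [1.0 / 2 ** L]  # level 0
    betas += [1.0 / 2 ** (L - l + 1) for l in range(1, L + 1)]
    return betas

def pyramid_likelihood(level_likelihoods):
    """Combine p(Y|l) for l = 0..L into p(Y) using the weights above."""
    L = len(level_likelihoods) - 1
    return sum(b * p for b, p in zip(pyramid_weights(L), level_likelihoods))
```

For L = 2 this gives weights (1/4, 1/4, 1/2), so the finest level contributes half of the total likelihood, matching the intuition that large cells should be penalized.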
It is worth noting that our motion descriptor has a high dimensionality; e.g., the dimensionality of the descriptor at level 2 is 320. Directly using a Gaussian Mixture Model would therefore involve estimating thousands of parameters, which is typically time consuming and quite unstable in a high-dimensional space. To avoid this, we first use Principal Component Analysis (PCA) to reduce the dimensionality of the motion descriptor at each pyramid level, and then use a GMM to model its distribution. The effectiveness of PCA-GMM has been demonstrated in [15]. In our work, the GMM is learned by maximizing the likelihood function using Expectation-Maximization, and the number of Gaussian components is determined automatically using the Minimum Description Length criterion [19]. The modeling process of the pyramid Gaussian mixture is illustrated in Fig. 3. Once the parameters (\hat{\Theta}) are estimated, given a motion template m with pyramid motion descriptor Y_t under a people hypothesis h, the likelihood with respect to the learned human model is calculated as follows:
p(m \mid h, o) = p(Y_t \mid \hat{\Theta}) = \sum_{l} \beta_l \, p(Y_t \mid l, \hat{\theta}_l) = \sum_{l} \beta_l \sum_{i \in I(l)} \hat{\alpha}_i \, \mathcal{N}(Y_t(l); \hat{u}_i, \hat{\Sigma}_i) \qquad (9)
where (\hat{\alpha}_i, \hat{u}_i, \hat{\Sigma}_i) are the estimated parameters of the Gaussian mixture. Since Y_t is computed from a given motion template (m) of a hypothesis (h) produced by a static object class detector (o), it is natural to represent p(Y_t \mid \hat{\Theta}) as p(m \mid h, o). In the following section, we will use this notation to give a clear illustration of the verification process using the Bayesian graph model.
Verification. For a bounding box hypothesis, we wish to find the probability of the presence of an object given its motion template m and appearance measure c, p(o \mid c, m, h), which is given by Bayes' rule as follows:
p(o \mid c, m, h) = \frac{1}{Z} \, p(m \mid h, o) \, p(c \mid h, o) \, p(h \mid o) \qquad (10)
where Z is the normalization factor. In this model, we assume that the motion m and the appearance c are conditionally independent. To make this clearer, the directed graphical model of the Bayesian verification process (Eq. (10)) is shown in Fig. 4, where the arrows indicate the dependencies between variables.
p(m \mid h, o) is the contribution of the motion within the hypothesis bounding box given