Digital Signal Processing Reference
A System for Video Object Segmentation
a fitness metric for the surface representation of a video object and formulate the video object segmentation problem as follows:

$$\arg\min_{S} \; E_{\text{vobject}}(S, I, M) \tag{4.1}$$
where $E_{\text{vobject}}$ is an energy function, $S$ is the spatio-temporal surface that represents the video object, $I$ is the video sequence, and $M$ is the a priori knowledge of the object. This energy function takes a lower value when the surface is consistent with $I$ and $M$. At the highest level, we split the information into two basic classes: internal and external energies, $E_{\text{internal}}$ and $E_{\text{external}}$, respectively.

$$E_{\text{vobject}}(S, I, M) = E_{\text{internal}}(S, M) + E_{\text{external}}(S, I) \tag{4.2}$$

External energies are deductive, i.e., the video object membership is
deduced from visual artifacts. Internal energies are inductive, i.e., an
assumption about the video object is made and the surface tries to fit
the assumption. As shown in Eq. 4.2, these components differ in their
functional arguments: internal energies are only dependent upon the
surface and model descriptions; external energies are dependent on the
surface and the video sequence. Internal energies counterbalance the
external energies' tendency to fit (and overfit) the visual artifacts in the
video sequence.
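
The decomposition in Eq. 4.2 and the minimization in Eq. 4.1 can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the text: the video is reduced to a few one-dimensional frames with intensities in [0, 1], the surface is a binary label per pixel, the external term is a squared difference between label and intensity, the internal term penalizes deviation from a hypothetical `expected_size` stored in the model, and the search is brute force.

```python
from itertools import product


def external_energy(surface, video):
    """Toy E_external(S, I): low when labels agree with pixel brightness."""
    return sum(
        (pixel - label) ** 2
        for labels, frame in zip(surface, video)
        for label, pixel in zip(labels, frame)
    )


def internal_energy(surface, model):
    """Toy E_internal(S, M): low when each frame has the expected object size."""
    return sum(abs(sum(labels) - model["expected_size"]) for labels in surface)


def vobject_energy(surface, video, model):
    """E_vobject(S, I, M) = E_internal(S, M) + E_external(S, I), as in Eq. 4.2."""
    return internal_energy(surface, model) + external_energy(surface, video)


def segment(video, model):
    """Exhaustive arg min over all binary labelings (feasible only for tiny inputs)."""
    n_frames, n_pixels = len(video), len(video[0])
    best_surface, best_energy = None, float("inf")
    for flat in product((0, 1), repeat=n_frames * n_pixels):
        # Reshape the flat label tuple into one label list per frame.
        surface = [
            list(flat[t * n_pixels:(t + 1) * n_pixels]) for t in range(n_frames)
        ]
        energy = vobject_energy(surface, video, model)
        if energy < best_energy:
            best_surface, best_energy = surface, energy
    return best_surface, best_energy


# Hypothetical data: two frames of three pixels, object assumed to be bright.
video = [[0.9, 0.8, 0.6], [0.2, 0.9, 0.7]]
model = {"expected_size": 2}
surface, energy = segment(video, model)
```

In this toy setup the external term alone would label the ambiguous 0.6 pixel as object, while the size prior in the internal term pushes it to background, which is the counterbalancing role described above.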
External and internal energies can be further classified by their time dependence and applicability, respectively. External energies can be split into two components, $E_{\text{image}}$ and $E_{\text{motion}}$, based upon whether they use temporal correlation in their analysis. $E_{\text{image}}$ does its analysis piecemeal, i.e., as if each frame were in isolation, while $E_{\text{motion}}$ treats the video sequence as a whole in its analysis.

$$E_{\text{external}}(S, I) = \int E_{\text{image}}\big(S(\ast, t'),\; I(x, y, t)|_{t=t'}\big)\, dt' + E_{\text{motion}}(S, I) \tag{4.3}$$

where $I(x, y, t)|_{t=t'}$ is the frame $t'$ of the video sequence.
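
For a sampled video, the integral over $t'$ in Eq. 4.3 becomes a sum over frames. The sketch below shows that structure only. Both energy terms are hypothetical stand-ins chosen for brevity (one-dimensional frames, intensities in [0, 1], binary labels); the text does not specify the actual form of $E_{\text{image}}$ or $E_{\text{motion}}$.

```python
def image_energy(labels, frame):
    """Toy E_image: evaluated on a single frame, with no temporal context."""
    return sum((pixel - label) ** 2 for label, pixel in zip(labels, frame))


def motion_energy(surface, video):
    """Toy E_motion: uses the whole sequence.

    Penalizes a label change between consecutive frames, and penalizes it
    more when the pixel intensity barely changed (assumed range [0, 1]).
    """
    total = 0.0
    for t in range(1, len(video)):
        for l_prev, l_curr, p_prev, p_curr in zip(
            surface[t - 1], surface[t], video[t - 1], video[t]
        ):
            total += abs(l_curr - l_prev) * (1.0 - abs(p_curr - p_prev))
    return total


def external_energy(surface, video):
    """Discrete analogue of Eq. 4.3: per-frame sum plus a sequence-level term."""
    per_frame = sum(image_energy(s_t, i_t) for s_t, i_t in zip(surface, video))
    return per_frame + motion_energy(surface, video)


# Hypothetical data: two frames of two pixels each.
video = [[0.9, 0.1], [0.8, 0.1]]
surface = [[1, 0], [1, 1]]
e_ext = external_energy(surface, video)
```

The per-frame term can be computed independently for each frame, whereas the motion term needs at least two frames at once, which mirrors the piecemeal versus whole-sequence distinction made above.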
Internal energies are split by their dependence upon a priori information: $E_{\text{world}}$, which are based upon environmental assumptions and are valid for all video objects, and $E_{\text{object}}$, which are based upon an assumption of object class:

$$E_{\text{internal}}(S, M) = E_{\text{world}}(S) + E_{\text{object}}(S, M) \tag{4.4}$$

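
As a worked step, substituting Eqs. 4.3 and 4.4 into Eq. 4.2 gives the full energy in terms of its four components. This expansion is not written out in the text; it follows directly from the three definitions, and no weighting between terms is assumed.

```latex
\begin{aligned}
E_{\text{vobject}}(S, I, M)
  &= E_{\text{internal}}(S, M) + E_{\text{external}}(S, I) \\
  &= E_{\text{world}}(S) + E_{\text{object}}(S, M)
     + \int E_{\text{image}}\big(S(\ast, t'),\; I(x, y, t)|_{t=t'}\big)\, dt'
     + E_{\text{motion}}(S, I)
\end{aligned}
```

Only $E_{\text{object}}$ depends on the model $M$, and only $E_{\text{image}}$ and $E_{\text{motion}}$ depend on the video $I$.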
While environmental assumptions do not require any information about the object, their generality usually limits their power. Object-based
 