Many techniques have used this relationship to express the matching process as an optimisation
or variational problem (Jordan, 1992). The objective is to find the vector (δx, δy) that
minimises the error given by
e_{x,y} = S\left( P(t+1)_{x+\delta x,\, y+\delta y},\; P(t)_{x,y} \right)    (4.72)
where S(·) represents a function that measures the similarity between pixels. As such, the
optimum is given by the displacement that minimises the image differences. There are
alternative measures of similarity that can be used to define the matching cost (Jordan,
1992). For example, we can measure the difference by taking the absolute value of the arithmetic
difference. Alternatively, we can consider the correlation or the squared values of the
difference or an equivalent normalised form. In practice, it is difficult to try to establish a
conclusive advantage of a particular measure, since they will perform differently depending
on the kind of image, the kind of noise and the nature of the motion we are observing. As
such, one is free to use any measure as long as it can be justified based on particular
practical or theoretical observations. The correlation and the squared difference will be
explained in more detail in the next chapter when we consider how a template can be
located in an image. We shall see that if we want to make the estimation problem in
Equation 4.72 equivalent to maximum likelihood estimation then we should minimise the
squared error. That is,
e_{x,y} = \left( P(t+1)_{x+\delta x,\, y+\delta y} - P(t)_{x,y} \right)^2    (4.73)
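The alternative similarity measures mentioned above can be written as small functions, which makes it easier to compare them on the same pair of pixels or patches. The following is a minimal sketch, assuming NumPy arrays of equal shape; the function names are illustrative and do not come from the text. Note that the two difference measures are costs (lower is better), whereas the correlation is a similarity (higher is better), so a minimisation would use its negative.

```python
import numpy as np

def absolute_difference(a, b):
    """Sum of absolute differences; lower means more similar."""
    return float(np.sum(np.abs(a.astype(float) - b.astype(float))))

def squared_difference(a, b):
    """Sum of squared differences (the measure of Equation 4.73); lower is better."""
    return float(np.sum((a.astype(float) - b.astype(float)) ** 2))

def normalised_correlation(a, b):
    """Zero-mean normalised correlation in [-1, 1]; higher means more similar."""
    a = a.astype(float) - a.mean()
    b = b.astype(float) - b.mean()
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    # A constant patch has no variation to correlate with, so return 0 by convention.
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0
```

The normalised form is insensitive to a uniform change in brightness between the two patches, which the plain differences are not; this is one reason the measures behave differently on different kinds of image.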
In practice, the implementation of the minimisation is extremely prone to error since the
displacement is obtained by comparing intensities of single pixels; it is very likely that the
intensity changes, or that a pixel can be confused with other pixels. In order to improve
performance, the optimisation includes the second assumption presented above. If
neighbouring points move with similar velocity, then we can determine the displacement
by considering not just a single pixel, but pixels in a neighbourhood. Thus,
e_{x,y} = \sum_{(x', y') \in W} \left( P(t+1)_{x'+\delta x,\, y'+\delta y} - P(t)_{x',y'} \right)^2    (4.74)
That is, the error in the pixel at position (x, y) is measured by comparing all the pixels
(x', y') in a window W. This makes the measure more stable by introducing an implicit
smoothing factor. The size of the window is a compromise between noise and accuracy.
Naturally, the automatic selection of the window parameter has attracted some interest
(Kanade, 1994). Another important problem is the amount of computation involved in the
minimisation when the displacement between frames is large. This has motivated the
development of hierarchical implementations. As you can envisage, other extensions have
considered more elaborate assumptions about the speed of neighbouring pixels.
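The windowed squared error of Equation 4.74 can be minimised by exhaustive search over integer displacements. The following is a minimal sketch in Python with NumPy, not the book's own listing; the function name and the parameters max_disp and half_window are assumptions made for illustration. Images are indexed as [row, column], so y is the first index and positive δy points down the image. Border pixels whose window or search range would leave the image are left at zero displacement, and ties are resolved in favour of no motion.

```python
import numpy as np

def estimate_displacements(frame_t, frame_t1, max_disp=2, half_window=1):
    """Per-pixel integer displacement minimising the windowed squared error."""
    p0 = frame_t.astype(float)
    p1 = frame_t1.astype(float)
    rows, cols = p0.shape
    dx_map = np.zeros((rows, cols), dtype=int)
    dy_map = np.zeros((rows, cols), dtype=int)
    h = half_window
    margin = max_disp + h  # keeps every window inside both frames

    for y in range(margin, rows - margin):
        for x in range(margin, cols - margin):
            ref = p0[y - h:y + h + 1, x - h:x + h + 1]
            # Start from zero displacement so flat regions report no motion.
            best = np.sum((p1[y - h:y + h + 1, x - h:x + h + 1] - ref) ** 2)
            for dy in range(-max_disp, max_disp + 1):
                for dx in range(-max_disp, max_disp + 1):
                    cand = p1[y + dy - h:y + dy + h + 1,
                              x + dx - h:x + dx + h + 1]
                    err = np.sum((cand - ref) ** 2)
                    if err < best:
                        best = err
                        dx_map[y, x] = dx
                        dy_map[y, x] = dy
    return dx_map, dy_map
```

The cost grows with the square of max_disp and the square of the window width, which illustrates why large displacements motivate the hierarchical implementations mentioned above.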
A straightforward implementation of the minimisation of the squared error is presented
in Code 4.20. This function has a pair of parameters that define the maximum displacement
and the window size. The optimum displacement for each pixel is obtained by comparing
the error for all the potential integer displacements. In a more complex implementation, it
is possible to obtain displacements with sub-pixel accuracy (Lawton, 1983). This is normally
achieved by a post-processing step based on sub-pixel interpolation or by matching surfaces
obtained by fitting the data at the integer positions. The effect of the selection of different
window parameters can be seen in the example shown in Figure 4.37. Figures 4.37(a) and
4.37(b) show an object moving up into a static background (at least for the two frames we
are considering). Figures 4.37(c), 4.37(d) and 4.37(e) show the displacements obtained by
considering windows of increasing size. Here, we can observe that as the size of the