theory assumes that an image replicates spatially to infinity. This difficulty can be reduced
by using window operators, such as the Hamming or the Hanning windows. These difficulties
do not arise for optical Fourier transforms, so using the Fourier transform for position-
invariant template matching is often confined to optical implementations.
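The effect of a window operator can be seen in a minimal sketch on a single image row. The row here is a hypothetical sinusoid chosen so that it does not join up smoothly with its own periodic copy, and `leakage_fraction` is an illustrative helper, not a standard function. The sketch compares how much spectral energy falls away from the main peak with and without a Hanning window.

```python
import numpy as np

def leakage_fraction(signal):
    """Fraction of spectral energy outside the strongest bin and two bins either side."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    peak = int(np.argmax(spectrum))
    lo, hi = max(peak - 2, 0), peak + 3
    return 1.0 - spectrum[lo:hi].sum() / spectrum.sum()

n = 256
x = np.arange(n)
# 10.5 cycles across the row (an assumed test signal): the ends do not match,
# so the implied periodic replication introduces a discontinuity
row = np.sin(2 * np.pi * 10.5 * x / n)

plain = leakage_fraction(row)
windowed = leakage_fraction(row * np.hanning(n))
print(f"leakage without window:      {plain:.4f}")
print(f"leakage with Hanning window: {windowed:.4f}")
```

The window tapers the row towards zero at both ends, so the replicated copies meet without a step. The price is a slightly wider main peak, which is why the helper counts a few bins around the maximum.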
5.3.3 Discussion of template matching
The advantages associated with template matching are mainly theoretical since it can be
very difficult to develop a template matching technique that operates satisfactorily. The
results presented here have been for position invariance only. If invariance to rotation and
scale is also required then this can cause difficulty. This is because the template is stored
as a discrete set of points. When these are rotated, gaps can appear due to the discrete
nature of the co-ordinate system. If the template is increased in size then again there will
be missing points in the scaled-up version. Again, there is a frequency domain version that
can handle variation in size, since scale invariant template matching can be achieved using
the Mellin transform (Bracewell, 1986). This avoids using many templates to accommodate
the variation in size by evaluating the scale-invariant match in a single pass. The Mellin
transform essentially scales the spatial co-ordinates of the image using an exponential
function. A point is then moved to a position given by a logarithmic function of its original
co-ordinates. The transform of the scaled image is then multiplied by the transform of the
template. The maximum again indicates the best match between the template and the
image. This can be considered to be equivalent to a change of variable. The logarithmic
mapping ensures that scaling (multiplication) becomes addition. By the logarithmic mapping,
the problem of scale invariance becomes a problem of finding the position of a match.
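The change of variable can be illustrated in one dimension. The following is a minimal sketch, not a Mellin transform implementation: it samples a hypothetical template and a stretched copy on an exponentially spaced axis, so that the stretch appears as a shift, and then locates that shift by correlation. The function names, the Gaussian bump and the sampling range are all assumptions made for the illustration.

```python
import numpy as np

def log_resample(f, n_samples=512, x_min=0.05, x_max=20.0):
    """Sample f at x = exp(u), with u uniformly spaced (a logarithmic axis)."""
    u = np.linspace(np.log(x_min), np.log(x_max), n_samples)
    return u, f(np.exp(u))

def template(x):
    # hypothetical 1-D template: a smooth bump centred at x = 1
    return np.exp(-((x - 1.0) ** 2) / 0.02)

true_scale = 2.0

def scaled(x):
    # the same shape stretched by true_scale
    return template(x / true_scale)

u, t_log = log_resample(template)
_, s_log = log_resample(scaled)

# On the logarithmic axis the stretch is a shift of log(true_scale),
# so the correlation peak gives the scale directly.
corr = np.correlate(s_log, t_log, mode="full")
lag = int(np.argmax(corr)) - (len(t_log) - 1)
du = u[1] - u[0]
estimated_scale = float(np.exp(lag * du))
print(f"estimated scale: {estimated_scale:.3f}")
```

The estimate is quantized by the sample spacing on the logarithmic axis, which hints at the resolution problems a discrete implementation faces.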
The Mellin transform only provides scale-invariant matching. For scale and position
invariance, the Mellin transform is combined with the Fourier transform, to give the
Fourier-Mellin transform. The Fourier-Mellin transform has many disadvantages in a
digital implementation. These arise from problems in spatial resolution, though there are
approaches to reduce them (Altmann, 1984), and from the difficulties with discrete images
experienced in Fourier transform approaches.
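The text does not spell out how the two transforms are combined. One common construction, assumed here, first takes the magnitude of the Fourier transform, since the magnitude is unchanged when the image is shifted. The sketch below checks that property on a hypothetical random test image; the shift is circular, in keeping with the periodic replication assumed by the discrete transform.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64))  # hypothetical test image

# Circular shift by 5 rows and -9 columns
shifted = np.roll(image, shift=(5, -9), axis=(0, 1))

mag_original = np.abs(np.fft.fft2(image))
mag_shifted = np.abs(np.fft.fft2(shifted))

# A shift only changes the phase of each Fourier component, not its magnitude
max_difference = float(np.max(np.abs(mag_original - mag_shifted)))
print(f"largest magnitude difference: {max_difference:.2e}")
```

For a shift that is not circular, image content enters and leaves at the borders and the equality holds only approximately.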
Again, the Mellin transform appears to be much better suited to an optical implementation
(Casasent, 1977), where continuous functions are available, rather than to discrete image
analysis. A further difficulty with the Mellin transform is that its result is independent of
the form factor of the template. Accordingly, a rectangle and a square appear to be the same
to this transform. This implies a loss of information since the form factor can indicate that
an object has been imaged from an oblique angle.
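The independence from form factor can be seen from a short derivation. The notation below is an assumption for illustration and does not come from the text: a two-dimensional Mellin transform with purely imaginary exponents, applied to a function f and to a copy g scaled by different positive factors a and b along the two axes.

```latex
% Assumed definition of the 2-D Mellin transform (imaginary exponents)
M_f(u,v) = \int_0^\infty \!\! \int_0^\infty f(x,y)\, x^{-ju-1}\, y^{-jv-1}\, dx\, dy

% Scaled copy with independent factors a, b > 0
g(x,y) = f(ax,\, by)

% Substituting x' = ax and y' = by
M_g(u,v) = a^{ju}\, b^{jv}\, M_f(u,v)

% The factors have unit modulus, so the magnitude is unchanged
\left| M_g(u,v) \right| = \left| M_f(u,v) \right|
```

Nothing in the final step requires a = b, so stretching a square into a rectangle leaves the magnitude untouched.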
So there are innate difficulties with template matching whether it is implemented directly,
or by transform calculus. For these reasons, and because many shape extraction techniques
require more than just edge or brightness data, direct digital implementations of feature
extraction are usually preferred. This is perhaps also influenced by the speed advantage
that one popular technique can confer over template matching. This is the Hough transform,
which is covered next.
5.4 Hough transform (HT)
5.4.1 Overview
The Hough Transform (HT) (Hough, 1962) is a technique that locates shapes in images. In
 