Feature extraction by shape matching - Feature Extraction and Image Processing

theory assumes that an image replicates spatially to infinity. Such difficulty can be reduced

by using window operators, such as the Hamming or the Hanning windows. These difficulties

do not arise in optical Fourier transforms, so the use of the Fourier transform for position-

invariant template matching is often confined to optical implementations.
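
The effect of a window operator can be sketched numerically. The example below is a minimal illustration in Python with NumPy, not a method taken from the text: the ramp image, its 64 x 64 size and the helper name hanning_window_2d are all assumptions chosen for the demonstration. It tapers an image with a separable Hanning window so that the values on opposite borders agree when the image is treated as periodic.

```python
import numpy as np

def hanning_window_2d(rows, cols):
    """Separable 2-D Hanning window built as an outer product of 1-D windows."""
    return np.outer(np.hanning(rows), np.hanning(cols))

# Hypothetical test image: a horizontal brightness ramp, dark on the left and
# bright on the right, so periodic replication creates a sharp seam.
image = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))

# Largest jump between the left and right borders, which become neighbours
# when the image is assumed to repeat.
seam_before = np.abs(image[:, 0] - image[:, -1]).max()

# Taper the image towards zero at its borders before transforming it.
tapered = image * hanning_window_2d(*image.shape)
seam_after = np.abs(tapered[:, 0] - tapered[:, -1]).max()

# The windowed image is what would then be passed to the Fourier transform.
spectrum = np.fft.fft2(tapered)
```

The window removes the artificial discontinuity at the image border, at the cost of attenuating genuine image content near the edges.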

5.3.3 Discussion of template matching

The advantages associated with template matching are mainly theoretical since it can be

very difficult to develop a template matching technique that operates satisfactorily. The

results presented here have been for position invariance only. If invariance to rotation and

scale is also required then this can cause difficulty. This is because the template is stored

as a discrete set of points. When these are rotated, gaps can appear due to the discrete

nature of the co-ordinate system. If the template is increased in size then again there will

be missing points in the scaled-up version. Again, there is a frequency domain version that

can handle variation in size, since scale invariant template matching can be achieved using

the Mellin transform (Bracewell, 1986). This avoids using many templates to accommodate

the variation in size by evaluating the scale-invariant match in a single pass. The Mellin

transform essentially scales the spatial co-ordinates of the image using an exponential

function. A point is then moved to a position given by a logarithmic function of its original

co-ordinates. The transform of the scaled image is then multiplied by the transform of the

template. The maximum again indicates the best match between the template and the

image. This can be considered to be equivalent to a change of variable. The logarithmic

mapping ensures that scaling (multiplication) becomes addition. By the logarithmic mapping,

the problem of scale invariance becomes a problem of finding the position of a match.
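
The change of variable can be illustrated in one dimension. The sketch below is an assumption-laden illustration rather than an implementation of the Mellin transform itself: the bump-shaped template, the scale factor of 4, the base-2 logarithmic axis and the sampling constants are all hypothetical choices. It samples a template and an enlarged copy at exponentially spaced positions, multiplies their Fourier transforms, and reads the scale from the position of the correlation peak.

```python
import numpy as np

SAMPLES_PER_OCTAVE = 32   # assumed resolution of the logarithmic axis
N_OCTAVES = 10            # positions run from x = 1 up to x = 2**10

def log_sample(profile):
    """Sample a 1-D profile at exponentially spaced positions x = 2**u."""
    u = np.arange(SAMPLES_PER_OCTAVE * N_OCTAVES) / SAMPLES_PER_OCTAVE
    return profile(2.0 ** u)

def template(x):
    """Hypothetical template: a smooth bump centred on x = 8."""
    return np.exp(-0.5 * (x - 8.0) ** 2)

def scaled_image(x, scale=4.0):
    """The same bump enlarged by `scale` (an assumed value)."""
    return template(x / scale)

t_log = log_sample(template)
i_log = log_sample(scaled_image)

# Multiply the transform of the log-mapped image by the conjugate transform
# of the log-mapped template: a circular cross-correlation along log(x).
corr = np.fft.ifft(np.fft.fft(i_log) * np.conj(np.fft.fft(t_log))).real

# The peak position is the shift along the logarithmic axis ...
shift = int(np.argmax(corr))
# ... and converting it back from octaves gives the scale factor.
estimated_scale = 2.0 ** (shift / SAMPLES_PER_OCTAVE)
```

With these settings a scale factor of 4 is two octaves, which corresponds to a shift of 64 samples on the logarithmic axis, so finding the scale has become a matter of finding a position.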

The Mellin transform only provides scale-invariant matching. For scale and position

invariance, the Mellin transform is combined with the Fourier transform, to give the

Fourier-Mellin transform. The Fourier-Mellin transform has many disadvantages in a

digital implementation: it suffers from problems in spatial resolution, though there are approaches

to reduce these problems (Altmann, 1984), and from the difficulties with discrete images

experienced in Fourier transform approaches.

Again, the Mellin transform appears to be much better suited to an optical implementation

(Casasent, 1977), where continuous functions are available, rather than to discrete image

analysis. A further difficulty with the Mellin transform is that its result is independent of

the form factor of the template. Accordingly, a rectangle and a square appear to be the same

to this transform. This implies a loss of information since the form factor can indicate that

an object has been imaged from an oblique angle.
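
The loss of the form factor can be seen in a small numerical sketch. The example below makes several assumptions: it uses the magnitude of the Fourier transform of a logarithmically resampled image as a stand-in for the magnitude of the Mellin transform, and the blob shape, the fourfold stretch and the sampling constants are hypothetical. A shape and a copy stretched along one axis only are clearly different images, yet their log-domain spectra have the same magnitude.

```python
import numpy as np

SAMPLES_PER_OCTAVE = 16   # assumed resolution of each logarithmic axis
N_OCTAVES = 8             # co-ordinates run from 1 up to 2**8

def log_grid():
    """Exponentially spaced (x, y) co-ordinates; x varies along columns."""
    u = np.arange(SAMPLES_PER_OCTAVE * N_OCTAVES) / SAMPLES_PER_OCTAVE
    x = 2.0 ** u
    return np.meshgrid(x, x, indexing="xy")

def blob(x, y, width, height):
    """Hypothetical smooth shape whose extent scales with width and height."""
    return np.exp(-0.5 * ((x / width - 8.0) ** 2 + (y / height - 8.0) ** 2))

X, Y = log_grid()
shape = blob(X, Y, 1.0, 1.0)       # equal extent along both axes
stretched = blob(X, Y, 4.0, 1.0)   # stretched fourfold along x only

# Stretching along x is only a shift along the log(x) axis, and a shift
# changes the phase of the Fourier transform but not its magnitude.
mag_shape = np.abs(np.fft.fft2(shape))
mag_stretched = np.abs(np.fft.fft2(stretched))

same_magnitude = np.allclose(mag_shape, mag_stretched, atol=1e-6)
same_image = np.allclose(shape, stretched)
```

Here same_magnitude comes out true while same_image is false: the two shapes cannot be told apart from the magnitudes alone, which is the information loss described above.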

So there are innate difficulties with template matching whether it is implemented directly,

or by transform calculus. For these reasons, and because many shape extraction techniques

require more than just edge or brightness data, direct digital implementations of feature

extraction are usually preferred. This is perhaps also influenced by the speed advantage

that one popular technique can confer over template matching. This is the Hough transform,

which is covered next.

5.4 Hough transform (HT)

5.4.1 Overview

The Hough Transform (HT) (Hough, 1962) is a technique that locates shapes in images. In
