targeted at vision in Zhou and Chellappa (1992). Support Vector Machines (SVMs) (Vapnik,
1995) are one of the more popular new approaches to data modelling and classification.
Amongst SVMs' advantages is excellent generalisation capability, which concerns the ability
to classify correctly samples which are not within the feature space used for training. SVMs
are already finding application in texture classification (Kim, 1999).
Also, there are methods aimed to improve classification capability by pruning the data
to remove that which does not contribute to the classification decision.
Principal components analysis (the Karhunen-Loève transform) can reduce dimensionality, orthogonalise and
remove redundant data. There is also linear discriminant analysis (also called canonical
analysis) to improve class separability, whilst concurrently reducing cluster size (it is
formulated to concurrently minimise the within-class distance and to maximise the between-
class distance). There are also algorithms aimed at choosing a reduced set of features for
classification: feature selection for improved discriminatory ability; a recent comparison
can be found in Jain and Zongker (1997). Alternatively, the basis functionals can be chosen
in such a way as to improve classification capability. Recently, interest in biometrics has
focused on combining different classifiers, such as face and speech, and there are promising
new approaches to accommodate this (Kittler, 1998a) and (Kittler, 1998b).
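The dimensionality reduction described above can be illustrated with a short sketch of principal components analysis. This is a minimal NumPy illustration under assumed array shapes and synthetic data, not code from the text:

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project feature vectors onto their principal components.

    features: (n_samples, n_dims) array; rows are feature vectors.
    Returns the (n_samples, n_components) projection, components
    ordered by decreasing variance (eigenvalue).
    """
    # Centre the data so the covariance matrix is meaningful
    centred = features - features.mean(axis=0)
    # Eigen-decomposition of the covariance matrix (the Karhunen-Loeve basis)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centred, rowvar=False))
    # eigh returns ascending eigenvalues; keep the largest n_components
    order = np.argsort(eigvals)[::-1][:n_components]
    return centred @ eigvecs[:, order]

# Correlated 3-D data that is essentially 2-D: the third
# coordinate is redundant, being the sum of the first two
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2))
data = np.column_stack([x[:, 0], x[:, 1], x[:, 0] + x[:, 1]])
reduced = pca_reduce(data, 2)
```

The projected coordinates are uncorrelated (orthogonalised), which is the property the text refers to when it says redundant data is removed.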
8.5 Segmentation
In order to segment an image according to its texture, we can measure the texture in a
chosen region and then classify it. This is equivalent to template convolution, but where the
result applied to pixels is the class to which they belong, as opposed to the usual result of
template convolution. Here, we shall use a 7 × 7 template size: the texture measures will
be derived from the 49 points within the template. First though we need data from which
we can make a classification decision, the training data. Naturally, this depends on a
chosen application. Here we shall consider the problem of segmenting the eye image into
regions of hair and skin.
This is a two-class problem for which we need samples of each class, samples of skin
and hair. We will take samples of each of the two classes; in this way, the classification
decision is as illustrated in Figure 8.5. The texture measures are the energy, entropy and
inertia of the co-occurrence matrix of the 7 × 7 region, so the feature space is three-dimensional.
The training data is derived from regions of hair and from regions of skin, as
shown in Figures 8.6(a) and (b), respectively. The first half of this data is the samples of
hair, the other half is samples of the skin, as required for the k-nearest neighbour classifier
of Code 8.5.
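The measurement and decision steps just described can be sketched as follows. This is not the book's Code 8.5: the grey-level count, the value of k and the synthetic 7 × 7 regions are assumptions made only so the sketch is self-contained.

```python
import numpy as np

def cooccurrence(region, levels):
    """Co-occurrence matrix of horizontally adjacent pixel pairs,
    normalised to sum to 1; region holds integers in [0, levels)."""
    P = np.zeros((levels, levels))
    for a, b in zip(region[:, :-1].ravel(), region[:, 1:].ravel()):
        P[a, b] += 1
    return P / P.sum()

def texture_features(region, levels=8):
    """Energy, entropy and inertia of the region's co-occurrence matrix."""
    P = cooccurrence(region, levels)
    i, j = np.indices(P.shape)
    energy = np.sum(P ** 2)
    entropy = -np.sum(P[P > 0] * np.log(P[P > 0]))
    inertia = np.sum((i - j) ** 2 * P)
    return np.array([energy, entropy, inertia])

def knn_classify(sample, train_data, train_labels, k=3):
    """k-nearest neighbour rule: majority label among the k training
    feature vectors closest to the sample in Euclidean distance."""
    distances = np.linalg.norm(train_data - sample, axis=1)
    nearest = train_labels[np.argsort(distances)[:k]]
    return np.bincount(nearest).argmax()

# Synthetic training regions: a flat texture and a noisy one
rng = np.random.default_rng(1)
flat = np.full((7, 7), 3)
noisy = rng.integers(0, 8, size=(7, 7))
train = np.vstack([texture_features(flat), texture_features(noisy)])
labels = np.array([0, 1])  # 0 = flat class, 1 = noisy class

# A previously unseen flat region should fall into class 0
predicted = knn_classify(texture_features(np.full((7, 7), 5)), train, labels, k=1)
```

The classifier operates in exactly the same way on hair and skin feature vectors; k is conventionally chosen odd so that a two-class vote cannot tie.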
We can then segment the image by classifying each pixel according to the description
obtained from its 7 × 7 region. Clearly, the training samples of each class should be
classified correctly. The result is shown in Figure 8.7(a). Here, the top left corner is first
(a). Here, the top left corner is first
(correctly) classified as hair, and the top row of the image is classified as hair until the skin
commences (note that the border inherent in template convolution reappears). In fact,
much of the image appears to be classified as expected. The eye region is classified as hair,
but this is a somewhat arbitrary decision; it is simply that hair is the closest texture feature.
Also, some of the darker regions of skin are classified as hair, perhaps the result of training
on regions of brighter skin.
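The per-pixel process described above amounts to sliding the 7 × 7 window across the image and assigning each centre pixel the class of its nearest training feature. The sketch below keeps that structure but, to stay self-contained, uses the window variance as a stand-in for the three co-occurrence measures; the synthetic image, training values and labels are invented for illustration:

```python
import numpy as np

def segment_by_texture(image, train_feats, train_labels, half=3):
    """Classify every pixel by a texture measure of its 7 x 7
    neighbourhood (half = 3), nearest-neighbour against the training
    features. As with template convolution, the border cannot be
    processed and is left unclassified (-1)."""
    h, w = image.shape
    labels = np.full((h, w), -1)
    for r in range(half, h - half):
        for c in range(half, w - half):
            window = image[r - half:r + half + 1, c - half:c + half + 1]
            feat = window.var()  # stand-in for energy/entropy/inertia
            labels[r, c] = train_labels[np.argmin(np.abs(train_feats - feat))]
    return labels

# Synthetic two-texture image: flat on the left, checkerboard on the right
img = np.zeros((20, 20))
rows, cols = np.indices((20, 10))
img[:, 10:] = (rows + cols) % 2
train_feats = np.array([0.0, 0.25])  # window variance of each training texture
train_labels = np.array([0, 1])      # 0 = flat ("skin"-like), 1 = busy ("hair"-like)
seg = segment_by_texture(img, train_feats, train_labels)
```

Evaluating a full window for every pixel is what makes the process demanding computationally.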
Naturally, this is a computationally demanding process. An alternative approach is
simply to classify regions as opposed to pixels. This is the tiled approach, with the result