the least distance) and their mode, the maximally represented class, is attributed to the
sample. In Figure 8.5, the 3-nearest neighbour decision is actually Class B, since the three
nearest samples contain one from Class A (the test sample's nearest neighbour) and two
from Class B. Because there are two elements of Class B, the sample is attributed to this
class by the 3-nearest neighbour rule. As such, selection from more than one point
introduces a form of feature-space smoothing and prevents the classification decision from
being unduly affected by noisy outlier points. Clearly, this smoothing has greater effect for
larger values of k. (Further details concerning a more modern view of the k-nearest
neighbour rule can be found in Michie et al. (1994).)
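As a concrete illustration of the rule, the following is a minimal Python sketch (the book's own listings are in Mathcad); the sample points and the Euclidean metric are assumptions chosen to mimic the situation of Figure 8.5.

from collections import Counter
import math

def k_nearest_neighbour(test, samples, labels, k):
    # Euclidean distance from the test point to every training sample.
    dists = [math.dist(test, s) for s in samples]
    # Indices of the k closest training samples, nearest first.
    nearest = sorted(range(len(samples)), key=lambda i: dists[i])[:k]
    # Mode (maximally represented class) of the k neighbours. Counter
    # preserves first-seen order on ties, so a tied vote falls back to
    # the nearest neighbour's class, as the text suggests.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# In the spirit of Figure 8.5: the single nearest neighbour is Class A,
# but the 3-nearest neighbour rule attributes the sample to Class B.
samples = [(1.0, 1.0), (2.0, 2.1), (2.2, 1.9), (5.0, 5.0)]
labels = ['A', 'B', 'B', 'A']
print(k_nearest_neighbour((1.5, 1.5), samples, labels, k=3))  # -> 'B'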
A Mathcad implementation of the k-nearest neighbour rule is given in Code 8.5. The
arguments are test (the vector of measurements of the test sample), data (the list of
vectors of measurements of all samples), size (the value of k) and no. The final parameter
no dictates the structure of the presented data and is the number of classes within that data.
The training data is presumed to have been arranged so that samples of each class are all
stored together. For two classes in the training data, no = 2 and each class occupies one-half
of the data set (the same situation as in Figure 8.5). If no = 3 then there are three classes,
each occupying one-third of the complete data set: the first third contains the first class, the
second third contains samples of another class, whilst the remaining third contains samples
of the final class. In application, the distances between the current sample, test, and all
other samples are first evaluated by using the function distance. Then the k nearest
neighbours are selected to form a vector of indices min; these identify the k neighbours
which are closest (in the feature space) to the sample test. The feature-space split fsp is
the number of samples per class and hence the spacing between the class blocks in the data.
A count is then incremented for the class with which each of the k neighbours is associated,
and the class which occurs the most times in the set of size nearest neighbours is returned
as the k-nearest neighbour decision. (If no such decision is possible, i.e. there is no
maximally represented class, then the technique can be arranged to return the class of the
nearest neighbour, by default.)
k_nn(test,data,size,no):=  for i∈0..rows(data)-1
                              dist_i ← distance(test,data,i)
                           for i∈0..size-1
                              posmin ← coord(min(dist),dist)
                              dist_posmin ← max(dist)+1
                              min_i ← posmin
                           fsp ← rows(data)/no
                           for j∈1..no
                              class_j ← 0
                           for i∈0..size-1
                              for j∈1..no
                                 class_j ← class_j+1 if (min_i ≥ (j-1)·fsp)·(min_i < j·fsp)
                           test_class ← coord(max(class),class)
                           test_class
Code 8.5 Implementing the k-nearest neighbour rule
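For readers without Mathcad, the sketch below is a hedged Python rendering of the same logic, not the book's code. It keeps Code 8.5's conventions of class-blocked training data and class numbers running 1..no, and assumes a Euclidean metric in place of the unspecified function distance.

import math

def k_nn(test, data, size, no):
    # Distance from the test sample to every training sample.
    dist = [math.dist(test, row) for row in data]
    # Indices of the `size` nearest neighbours; Code 8.5 finds these by
    # repeatedly locating the minimum and masking it with max(dist)+1.
    nearest = sorted(range(len(data)), key=lambda i: dist[i])[:size]
    # Feature-space split: number of samples in each class block.
    fsp = len(data) // no
    # Increment a count for the class block each neighbour falls in.
    votes = [0] * no
    for i in nearest:
        votes[i // fsp] += 1
    # Class numbers run 1..no, as in the Mathcad listing. Ties resolve
    # to the lowest class number here; the nearest-neighbour fallback
    # mentioned in the text would need a small extra check.
    return votes.index(max(votes)) + 1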