and the city block distance (also called the L1 or Manhattan distance) is defined as
\[
d_{CB}(\mathbf{x}_1, \mathbf{x}_2) = \sum_{m=1}^{M} \left| x_{1m} - x_{2m} \right|, \tag{4.18}
\]
where M is the number of spectral bands.
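As a small illustration of Eq. (4.18), the following sketch computes the city block distance between two pixel vectors. The function name and the sample band values are hypothetical and serve only as an example.

```python
def city_block_distance(x1, x2):
    """L1 (Manhattan) distance between two pixel vectors, one value per band."""
    if len(x1) != len(x2):
        raise ValueError("pixel vectors must have the same number of bands")
    # Sum of absolute band-wise differences, as in Eq. (4.18).
    return sum(abs(a - b) for a, b in zip(x1, x2))


# Two made-up pixels with M = 4 spectral bands.
pixel_a = [52, 61, 40, 120]
pixel_b = [50, 65, 43, 110]
distance = city_block_distance(pixel_a, pixel_b)
print(distance)  # 2 + 4 + 3 + 10 = 19
```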
It is possible to determine clusters of pixels in the image using a distance
measure. We use the sum of the squared error (SSE) as the objective function,
which measures the quality of a clustering. In other words, we calculate the error of
each point (its Euclidean distance to the closest centroid) and then compute the total
sum of the squared errors
\[
SSE = \sum_{C_i} \sum_{\mathbf{x} \in C_i} (\mathbf{x} - \boldsymbol{\mu}_i)^{t} (\mathbf{x} - \boldsymbol{\mu}_i), \tag{4.19}
\]
where $C_i$ is the i-th cluster and $\boldsymbol{\mu}_i$ is its mean vector. SSE has a
theoretical minimum of zero, which is reached when every point coincides with its
cluster mean, for example when each cluster contains a single data point.
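The SSE of Eq. (4.19) can be evaluated directly from a given partition. The sketch below assumes that each cluster is supplied as a non-empty list of pixel vectors; the function name and the data are illustrative only.

```python
def sse(clusters):
    """Sum of squared Euclidean distances from each point to its cluster mean.

    `clusters` is a list of clusters; each cluster is a non-empty list of
    pixel vectors of equal length (one value per spectral band).
    """
    total = 0.0
    for points in clusters:
        n_bands = len(points[0])
        # Mean vector of the cluster, computed band by band.
        mean = [sum(p[m] for p in points) / len(points) for m in range(n_bands)]
        for p in points:
            total += sum((p[m] - mean[m]) ** 2 for m in range(n_bands))
    return total


example = [
    [[1.0, 1.0], [3.0, 1.0]],  # mean (2, 1): each point contributes 1
    [[10.0, 10.0]],            # single point: contributes 0
]
total = sse(example)
print(total)  # 2.0
```

With every point in its own cluster, the same function returns zero, which matches the theoretical minimum mentioned above.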
The objective function in Eq. (4.19) can be minimized using an iterative
procedure known as K-means or migrating means. The K-means method
(MacQueen 1967) is one of the simplest unsupervised learning algorithms. This
procedure classifies a given data set using an a priori fixed number of clusters, C.
The algorithm consists of the following steps:
• Step 1. Place C points in the multispectral space. These points represent the
initial group centroids.
• Step 2. Assign each pixel to the group that has the closest centroid.
• Step 3. When all pixels have been assigned, recalculate the positions of the
C centroids.
• Step 4. Repeat Steps 2 and 3 until the centroids no longer move. This separates
the objects into groups.
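Steps 1 to 4 can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the `max_iter` safeguard, and the choice to leave the centroid of an empty cluster where it is are assumptions that the text does not specify.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two pixel vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(pixels, initial_centroids, max_iter=100):
    """Basic K-means; returns (centroids, labels)."""
    # Step 1: start from the supplied centroids.
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Step 2: assign each pixel to the closest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i]))
            for p in pixels
        ]
        # Step 3: recompute each centroid as the mean of its members.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(pixels, labels) if lab == i]
            if members:
                new_centroids.append(
                    [sum(band) / len(members) for band in zip(*members)]
                )
            else:
                new_centroids.append(old)  # assumption: keep empty cluster in place
        # Step 4: stop when the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Made-up two-band pixels forming two obvious groups.
pixels = [[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.0, 8.0]]
centroids, labels = k_means(pixels, [[0.0, 0.0], [10.0, 10.0]])
print(centroids)  # [[1.0, 1.5], [8.5, 8.0]]
print(labels)     # [0, 0, 1, 1]
```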
When the centroids are randomly initialized, different runs of the K-means
algorithm typically produce different values for the objective function, so choosing
good initial centroids is key to the basic K-means procedure. A technique that
is commonly used to address this problem is to perform multiple runs, each with a
different set of randomly chosen initial centroids. The set of clusters with the
minimum SSE is then chosen as the solution. For other possible initial values for
the centroids, see Everitt et al. (2011). The principal limitation of this
method is the need to pre-specify the number of clusters. This choice also influences
the computational effort of the procedure.
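The multiple-run strategy can be sketched as follows. The example assumes that the initial centroids are drawn at random from the pixels themselves, which is only one of several possible initializations; the function names, `n_runs`, and `seed` are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(pixels, initial_centroids, max_iter=100):
    """Basic K-means; returns (centroids, labels)."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i]))
            for p in pixels
        ]
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(pixels, labels) if lab == i]
            if members:
                new_centroids.append(
                    [sum(band) / len(members) for band in zip(*members)]
                )
            else:
                new_centroids.append(old)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def clustering_sse(pixels, centroids, labels):
    """SSE of a clustering, measured against the given centroids."""
    return sum(squared_distance(p, centroids[lab])
               for p, lab in zip(pixels, labels))


def best_of_n_runs(pixels, n_clusters, n_runs=10, seed=0):
    """Run K-means n_runs times and keep the result with the smallest SSE."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best = None
    for _ in range(n_runs):
        # Assumption: initial centroids are distinct pixels chosen at random.
        initial = rng.sample(pixels, n_clusters)
        centroids, labels = k_means(pixels, initial)
        score = clustering_sse(pixels, centroids, labels)
        if best is None or score < best[0]:
            best = (score, centroids, labels)
    return best


pixels = [[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.0, 8.0]]
score, centroids, labels = best_of_n_runs(pixels, n_clusters=2)
print(score, sorted(centroids))
```

Each additional run costs one full K-means pass, so `n_runs` trades computation for a better chance of reaching a low-SSE solution.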
One possible modification of the K-means algorithm is the iterative self-organizing
data analysis technique (ISODATA, Ball and Hall 1965). In addition
to the K-means steps, this algorithm merges clusters if their separation distance in
multispectral space is less than a specified value, and splits a single cluster
into two clusters if a splitting condition is satisfied.
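The merging rule can be illustrated with a short sketch. It covers only the merge step and assumes that two merged clusters are replaced by their size-weighted mean; the function name and the `min_separation` parameter are hypothetical. The split step is left out because the splitting condition is not specified here.

```python
import math


def merge_close_clusters(centroids, sizes, min_separation):
    """Merge clusters whose centroids are closer than min_separation.

    `sizes` holds the (positive) number of pixels in each cluster. Pairs are
    merged one at a time until no remaining pair is closer than the threshold.
    """
    centroids = [list(c) for c in centroids]
    sizes = list(sizes)
    merged = True
    while merged:
        merged = False
        for i in range(len(centroids)):
            for j in range(i + 1, len(centroids)):
                if math.dist(centroids[i], centroids[j]) < min_separation:
                    n = sizes[i] + sizes[j]
                    # Assumption: the merged centroid is the size-weighted mean.
                    centroids[i] = [
                        (sizes[i] * a + sizes[j] * b) / n
                        for a, b in zip(centroids[i], centroids[j])
                    ]
                    sizes[i] = n
                    del centroids[j]
                    del sizes[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan with the updated cluster list
    return centroids, sizes


centroids, sizes = merge_close_clusters(
    [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]], [3, 1, 5], min_separation=2.0
)
print(centroids)  # [[0.25, 0.0], [10.0, 10.0]]
print(sizes)      # [4, 5]
```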