Some alternative methods have been proposed to reduce the clustering effort. A
very simple partition method (called the single pass method) creates a partitioned
dataset as follows:
• Step 1. Let the first pixel be the centroid of the first cluster.
• Step 2. For the next pixel, calculate its distance, D, from each existing cluster centroid using some distance measure.
• Step 3. If the lowest calculated D is less than a specified threshold value, add the pixel to the corresponding cluster and recompute that cluster's centroid; otherwise, use the pixel to form a new cluster. If any pixels remain to be clustered, return to Step 2.
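The steps above can be sketched as follows. This is a minimal illustration, assuming pixels are NumPy feature vectors and Euclidean distance is the chosen measure; the function name and threshold parameter are illustrative, not from the original text.

```python
import numpy as np

def single_pass_cluster(pixels, threshold):
    """Single-pass clustering sketch: assign each pixel to the nearest
    existing cluster if within `threshold`, otherwise start a new cluster.
    Note: the result depends on the order in which pixels are processed."""
    # Step 1: the first pixel seeds the first cluster
    centroids = [pixels[0].astype(float)]
    members = [[0]]
    for i, p in enumerate(pixels[1:], start=1):
        # Step 2: distance D from each existing centroid (Euclidean here)
        d = [np.linalg.norm(p - c) for c in centroids]
        k = int(np.argmin(d))
        if d[k] < threshold:
            # Step 3a: join the closest cluster and recompute its centroid
            members[k].append(i)
            centroids[k] = pixels[members[k]].mean(axis=0)
        else:
            # Step 3b: the pixel forms a new cluster
            centroids.append(p.astype(float))
            members.append([i])
    return centroids, members
```

For example, four pixels at (0,0), (0,1), (10,10), (10,11) with a threshold of 3 produce two clusters; feeding the same pixels in a different order can change the outcome, which is the order dependence noted below.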
As its name implies, this method requires only one pass through the dataset, which makes it a very efficient clustering method on a serial processor. One disadvantage is that the resulting clusters depend on the order in which the pixels are processed. Some possible alternatives to the single pass technique were proposed by Richards and Jia (2006).
Another cluster method that does not require the number of classes to be
predefined is hierarchical clustering. A hierarchical agglomerative classification
can be constructed using the following general algorithm:
• Step 1. Find the two closest pixels and merge them into a cluster.
• Step 2. Find and merge the next two closest points, where a point is either an
individual pixel or a cluster of pixels.
• Step 3. If more than one cluster remains, return to Step 2.
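A naive sketch of the agglomerative steps above, using the single link definition of closeness (the minimum pairwise distance between members of two clusters), might look like this. The function name is illustrative; the recorded merge history is the information a dendrogram displays.

```python
import numpy as np
from itertools import combinations

def agglomerative_single_link(points):
    """Naive single-link agglomerative clustering sketch. Repeatedly
    merges the two closest clusters (Steps 1-3), recording the distance
    at each merge; this merge history underlies the dendrogram."""
    clusters = [[i] for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        # find the pair of clusters with the minimum single-link distance
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = min(np.linalg.norm(points[i] - points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        history.append((d, sorted(clusters[a])))
    return history
```

The O(n^3) pairwise search here illustrates why, as noted below, hierarchical methods struggle with large datasets; practical implementations use optimized routines such as those in SciPy.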
This method produces an output that allows the analyst to decide how many groups the data should be divided into. This choice is made using a graph called a dendrogram, which represents the merging history and shows the distances at which clusters were merged. Hierarchical cluster methods differ by the definition
used to identify the closest pair of points, and by the means used to describe a newly
merged cluster. The main techniques are: the single link, the complete link, and the
group average methods. However, it is worth noting that hierarchical clustering
algorithms are not often used in practice, because they cannot easily manage large
amounts of data. See Everitt et al. ( 2011 ) for more details about the hierarchical
clustering approach.
Classes can also be identified using a histogram peak selection technique. This is
equivalent to searching for the peaks in a one-dimensional histogram, where a peak
is defined as a value with a greater frequency than its neighbors. After the peaks
have been identified, the pixels are assigned the value of their nearest peak.
Membership of a class is defined by the neighborhood of a peak. Under broad generalization, a peak must have a frequency higher than all of its neighbors along the same row and down the same column; fine generalization allows one non-diagonal neighbor with a higher frequency. Clearly this method is only useful for low-dimensional data (Richards and Jia 2006).
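For the one-dimensional case, the peak selection idea can be sketched as follows. This is an illustrative implementation, assuming a simple binned histogram where a peak is an interior bin whose count exceeds both neighbors; the function name and bin count are assumptions, not from the original text.

```python
import numpy as np

def histogram_peak_classify(values, n_bins=32):
    """Sketch of 1-D histogram peak selection: find bins whose count
    exceeds both neighbours, then assign each value to its nearest peak."""
    values = np.asarray(values, dtype=float)
    counts, edges = np.histogram(values, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # interior bins that are strictly greater than both neighbours
    peaks = np.array([centers[i] for i in range(1, n_bins - 1)
                      if counts[i] > counts[i - 1]
                      and counts[i] > counts[i + 1]])
    # assign each value the label of its nearest peak
    labels = np.argmin(np.abs(values[:, None] - peaks[None, :]), axis=1)
    return peaks, labels
```

With bimodal data concentrated around two values, this returns two peaks and splits the values between them; a real implementation would also need to handle ties and the case where no interior peak exists.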