Image Processing Reference
methods, such as k-Nearest Neighbor (kNN). The “standard” kNN methods directly
implement a decision function based on the number of training pixels per class
proportional to the prior probability. This is the prior probability of the occurrence
of w_j irrespective of the pixel's feature vector, and as such is open to estimation by
prior knowledge external to the remotely sensed image. Typically, ML classifiers
assume prior probabilities to be equal and assign each Pr(w_j) a value of 1.0
(Strahler 1980). However, incorporating variations in prior probabilities can be an
important step toward reducing the detrimental effects of spectrally overlapping
classes. If a feature vector
x has probability density values that are significantly different from zero for several
classes, it is not inconceivable for that pixel to belong to any of these classes. When
selecting a class solely on the basis of its spectral characteristics, a large probability
of error frequently results. The use of appropriate prior probabilities, based on
reliable ancillary information, is one way to reduce this error in class assignments.
Moreover, it would seem intuitively more sensible to suggest that some classes are
more likely to occur than others.
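The effect of the prior on the ML decision rule can be sketched as follows. This is a minimal illustration, not the implementation used in the studies cited: the Gaussian class statistics and prior values are invented for the example, and the discriminant is the standard log-form Gaussian ML rule with a ln Pr(w_j) term.

```python
import numpy as np

def ml_discriminant(x, mean, cov, prior):
    """Log-form Gaussian ML discriminant:
    g_j(x) = ln Pr(w_j) - 0.5 ln|Sigma_j| - 0.5 (x - mu_j)^T Sigma_j^-1 (x - mu_j)."""
    diff = x - mean
    return (np.log(prior)
            - 0.5 * np.log(np.linalg.det(cov))
            - 0.5 * diff @ np.linalg.inv(cov) @ diff)

def classify(x, classes):
    """Assign x to the class with the largest discriminant score."""
    scores = {name: ml_discriminant(x, m, c, p)
              for name, (m, c, p) in classes.items()}
    return max(scores, key=scores.get)

# Two spectrally overlapping classes (illustrative statistics): identical
# covariances, means placed symmetrically about the candidate pixel.
cov = np.eye(2)
x = np.array([0.0, 0.0])  # pixel midway between the class means
classes_equal = {
    "A": (np.array([-1.0, 0.0]), cov, 0.5),
    "B": (np.array([1.0, 0.0]), cov, 0.5),
}
classes_unequal = {
    "A": (np.array([-1.0, 0.0]), cov, 0.2),
    "B": (np.array([1.0, 0.0]), cov, 0.8),
}
# With equal priors the spectral evidence is ambiguous; an informative
# prior breaks the tie in favour of the more probable class.
print(classify(x, classes_unequal))  # "B"
```

With equal priors the two discriminant scores are identical, so the spectral data alone cannot separate the pixel; the unequal priors resolve the ambiguity, which is exactly the error-reduction role described above.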
Many proprietary software packages (such as ERDAS Imagine, ENVI and
Idrisi) allow the use of prior probabilities, where the user is expected to make
estimates by using information on the anticipated (relative) class areas. The increase
in classification accuracy from these 'global priors' is, however, often limited. At the
other extreme, a vector of prior probabilities for each individual pixel is pointless,
because that would be tantamount to a completed classification. A compromise
scale somewhere between the global and individual scales can be derived by first
subdividing the image into strata, or segments, according to ancillary context data,
and then finding the local prior probability vector for each stratum. For example,
Mesev (1998; 2001) used extraneous data from the population census to segment
and classify a Landsat Thematic Mapper (TM) image according to contextual and
Bayesian rules on housing classes (Table 8.1 and Fig. 8.1).
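The stratified-prior compromise described above amounts to converting ancillary area estimates into a separate prior-probability vector for each stratum. The sketch below assumes hypothetical per-stratum census area figures and class names; they are illustrative only and not taken from Mesev's study:

```python
# Hypothetical census area shares (hectares) per enumeration-district
# stratum; the class names and figures are invented for illustration.
census_areas = {
    "stratum_1": {"high_density": 120.0, "medium_density": 60.0, "low_density": 20.0},
    "stratum_2": {"high_density": 10.0, "medium_density": 40.0, "low_density": 150.0},
}

def stratum_priors(areas):
    """Normalise ancillary area estimates into a prior-probability
    vector for each stratum (priors sum to 1 within a stratum)."""
    priors = {}
    for stratum, by_class in areas.items():
        total = sum(by_class.values())
        priors[stratum] = {cls: a / total for cls, a in by_class.items()}
    return priors

priors = stratum_priors(census_areas)
print(priors["stratum_1"]["high_density"])  # 0.6
```

Each stratum's vector would then replace the global prior in the ML decision rule for pixels falling inside that stratum, giving locally informed priors without degenerating into a per-pixel (already-classified) extreme.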
Prior probabilities for three housing types (high, medium and low density) are
entered into the ML decision rule at the stratified enumeration district (local) level.
Results in Table 8.1 and Figs. 8.2 and 8.3 are based on comparisons of class area
estimates of classifications of the three housing types generated by equal and
stratified unequal prior probabilities. Figure 8.3 shows that area estimates for most
urban land use classes produced by the Bayes-modified ML classifier are closer to
those derived from the size-ratio transformed census figures. Total absolute error in all
settlements is consistently lower under conditions of unequal as opposed to equal
prior probabilities. However, in terms of housing, there are considerable variations
between types and across the five settlements. No one housing type has consistently
lower area estimation error, but there is some evidence to suggest that high density
housing is under-predicted (i.e. fewer pixels classified) and, conversely, low density
housing is over-predicted (i.e. more pixels classified). The reason for this may lie in
the highly concentrated nature of British housing in central areas of towns.
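The total-absolute-error comparison used above can be made concrete with a small sketch. The class names and area figures below are invented for illustration and are not the published results:

```python
def total_absolute_error(estimated, reference):
    """Sum of |estimated area - reference area| over all classes."""
    return sum(abs(estimated[cls] - reference[cls]) for cls in reference)

# Illustrative area estimates (hectares) for one settlement:
# census-derived reference vs. classifications run with equal and
# with stratified unequal priors (all figures hypothetical).
reference = {"high": 100.0, "medium": 80.0, "low": 60.0}
equal_priors = {"high": 70.0, "medium": 95.0, "low": 85.0}
unequal_priors = {"high": 90.0, "medium": 85.0, "low": 70.0}

print(total_absolute_error(equal_priors, reference))    # 70.0
print(total_absolute_error(unequal_priors, reference))  # 25.0
```

In this constructed case the unequal-prior run has the lower total absolute error, mirroring the pattern reported across the settlements, while per-class errors can still vary in sign (under- vs. over-prediction).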
The spatial extent of individual houses around the central core may sometimes
be much smaller than the spatial resolution of the satellite sensor. However, what
becomes apparent from these results is that classifications are highly site specific,
and they underline the immense problems that arise when sub-residential classifications