somewhat impractical because of having to collect ground information for the accuracy assessment
at random locations on the ground.
Two difficult problems arise when using random locations: (1) the locations can be very difficult
to access and (2) they can only be selected after the classification has been performed. This second
condition forces the accuracy assessment data to be collected late in the project rather than in
conjunction with the training data collection, thereby increasing the costs of the project. In addition,
in some projects the time between the start of the project and the accuracy assessment may be so
long as to cause temporal problems in collecting reference data.
Spatial autocorrelation is said to occur when the presence, absence, or degree of a certain
characteristic affects the presence, absence, or degree of the same characteristic in neighboring
units (Cliff and Ord, 1973). This condition is particularly important in accuracy assessment if
an error at a certain location can be found to influence errors at surrounding locations positively
or negatively (Campbell, 1981). Work by Congalton (1988b) on Landsat MSS data from three
areas of varying spatial diversity (agriculture, range, and forest) showed a positive influence
extending as far as 30 pixels (1.8 km) away. More recent work by Pugh and Congalton (2001) using
Landsat TM data in a forested environment showed similar issues with spatial autocorrelation.
These results affect the choice of sample size and, especially, the sampling scheme used in the
accuracy assessment.
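One common way to quantify the spatial autocorrelation described above is Moran's I, computed on a binary error map in which each pixel is coded 1 if misclassified and 0 if correct. The sketch below is only illustrative and is not the method used in the studies cited: the function name `morans_i`, the rook (4-neighbor) adjacency with unit weights, and the two small example maps are all assumptions made for this example.

```python
def morans_i(grid):
    """Moran's I for a 2-D grid, using rook (4-neighbor) adjacency with unit weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    # Sum of squared deviations from the mean (denominator term).
    ss = sum((v - mean) ** 2 for row in grid for v in row)
    if ss == 0:
        raise ValueError("grid is constant; Moran's I is undefined")
    cross = 0.0  # sum over neighbor pairs of (x_i - mean) * (x_j - mean)
    w_sum = 0    # total number of (directed) neighbor pairs, i.e. sum of weights
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    w_sum += 1
    return (n / w_sum) * (cross / ss)


# Hypothetical 4 x 4 error maps: 1 = misclassified pixel, 0 = correct pixel.
clustered = [[1, 1, 0, 0] for _ in range(4)]  # errors grouped in the left half
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

print(morans_i(clustered))     # positive: errors sit next to other errors
print(morans_i(checkerboard))  # negative: errors alternate with correct pixels
```

A clearly positive value suggests that errors cluster, in which case samples taken close together carry less independent information than their count implies.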
1.3 CURRENT ISSUES AND NEEDS
The major sampling issue of importance today is the choice of the sample unit. Historically,
a single pixel has often been chosen as the sample unit. However, it is extremely difficult to know
exactly where that pixel is on the reference data, especially when the reference data are generated
on the ground (using field work). Despite recent advances in Global Positioning System (GPS)
technology, it is very rare to achieve adequate location information for a single pixel. Many times
the GPS unit is used under dense forest canopy and the GPS signals are weak or absent. Location
becomes even more problematic with the new high-spatial-resolution sensors such as Space Imaging
IKONOS or DigitalGlobe imagery with pixels as small as 1 m. Also, it is nearly impossible to
match the corners of a pixel on an image to the ground despite our best registration algorithms.
Therefore, using a single pixel as the sampling unit can cause much of the error represented in
the error matrix to be positional error rather than thematic error. Since the goal of the error matrix
is to measure thematic error, it is best to take steps to avoid including positional error. Single
pixels should not be used as the sample unit. Instead, some cluster of pixels or a polygon should
be used.
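As a rough illustration of using a cluster of pixels rather than a single pixel as the sample unit, the sketch below labels a sample by the majority class within a square block of the classified map. The block size (3 x 3 by default), the function name `block_label`, and the small example map are assumptions for illustration; the text itself does not prescribe a particular cluster size.

```python
from collections import Counter


def block_label(class_map, row, col, half=1):
    """Return (majority_label, fraction) for the square block centered on (row, col).

    The block spans 2 * half + 1 pixels on each side (3 x 3 when half=1).
    """
    rows, cols = len(class_map), len(class_map[0])
    if not (half <= row < rows - half and half <= col < cols - half):
        raise ValueError("block extends beyond the map")
    labels = [class_map[r][c]
              for r in range(row - half, row + half + 1)
              for c in range(col - half, col + half + 1)]
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


# Hypothetical classified map: "F" = forest, "A" = agriculture.
demo_map = [
    ["F", "F", "F", "A", "A"],
    ["F", "F", "F", "A", "A"],
    ["F", "F", "F", "A", "A"],
    ["F", "F", "A", "A", "A"],
    ["F", "F", "A", "A", "A"],
]

print(block_label(demo_map, 1, 1))  # homogeneous block, fraction 1.0
print(block_label(demo_map, 2, 2))  # mixed block straddling the class boundary
```

Because the label comes from the whole block, a positional shift of a pixel or so changes the result far less than it would for a single-pixel sample. The returned fraction can also be used to discard blocks that are too mixed to label with confidence.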
Edge and Boundary Effects
Traditionally, accuracy assessment has avoided the boundaries between different LC classes
by taking samples near the center of each polygon, or at least away from the edges.
Avoiding the edges also helps to minimize the locational error discussed in the previous section.
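The traditional practice of sampling away from class boundaries can be sketched as a simple filter on the classified map: keep only those pixels whose entire surrounding neighborhood carries the same class. This is a minimal illustration, not a prescribed procedure; the function name `interior_pixels`, the `buffer` parameter (in pixels), and the example map are hypothetical.

```python
def interior_pixels(class_map, buffer=1):
    """List (row, col) positions whose square neighborhood contains a single class.

    The neighborhood spans `buffer` pixels in every direction. Pixels closer
    than `buffer` to the map border are also excluded.
    """
    rows, cols = len(class_map), len(class_map[0])
    keep = []
    for r in range(buffer, rows - buffer):
        for c in range(buffer, cols - buffer):
            label = class_map[r][c]
            if all(class_map[rr][cc] == label
                   for rr in range(r - buffer, r + buffer + 1)
                   for cc in range(c - buffer, c + buffer + 1)):
                keep.append((r, c))
    return keep


# Hypothetical 5 x 6 map: forest ("F") on the left, agriculture ("A") on the right.
edge_map = [["F", "F", "F", "A", "A", "A"] for _ in range(5)]

candidates = interior_pixels(edge_map, buffer=1)
print(candidates)  # positions at least one pixel away from the F/A boundary
```

Sample locations would then be drawn only from `candidates`. Note that a map assessed this way says nothing about accuracy near the boundaries themselves.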
Where exactly to draw the line between different cover types on the ground is very subjective.
Most LC or vegetation maps divide a rather continuous environment called Earth into a number
of discrete categories. The number of categories varies with the objective of the mapping, and our
ability to separate different categories depends on the variability within and between each category.