that is subject to error, and the map's accuracy is characterized by estimates of errors in the
classification process that produced it. For example, the Moderate Resolution Imaging
Spectroradiometer (MODIS) LC product provides a confidence value for each pixel that measures how
well the pixel fits the training examples presented to the classifier. Design-based inference uses
statistical principles in which samples are acquired to infer characteristics of a finite population,
such as the pixels in a LC map. The key to this approach is probability-based sampling, in which
the units to be sampled are drawn with known probabilities. Examples include simple random
sampling, in which all possible sample units have equal probability of being drawn, and stratified
random sampling, in which all possible sample units within a particular stratum have equal
probability of being drawn.
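The two sampling designs just described can be sketched in a few lines of Python. This is a minimal illustration, not a procedure taken from the text: the toy map, the class names, the function names, and the sample sizes are all assumptions made up for the example. It shows that in both designs the inclusion probability of every pixel is known in advance.

```python
import random
from collections import defaultdict


def simple_random_sample(pixel_ids, n, seed=0):
    """Draw n pixels without replacement; each pixel has inclusion probability n/N."""
    rng = random.Random(seed)
    return rng.sample(pixel_ids, n)


def stratified_random_sample(pixel_classes, n_per_stratum, seed=0):
    """Draw an equal number of pixels from each map class (stratum).

    pixel_classes maps a pixel id to its map class. Returns the sampled ids
    per class and the known inclusion probability for pixels of each class.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for pixel_id, cover in pixel_classes.items():
        strata[cover].append(pixel_id)

    sample, inclusion_prob = {}, {}
    for cover, members in sorted(strata.items()):
        k = min(n_per_stratum, len(members))
        sample[cover] = rng.sample(members, k)
        inclusion_prob[cover] = k / len(members)
    return sample, inclusion_prob


# Hypothetical 1000-pixel map: 900 forest, 90 grassland, 10 water.
toy_map = {
    i: ("forest" if i < 900 else "grassland" if i < 990 else "water")
    for i in range(1000)
}

srs = simple_random_sample(list(toy_map), 50)
strat, incl = stratified_random_sample(toy_map, 5)
```

In the stratified draw, a water pixel is selected with probability 5/10 while a forest pixel is selected with probability 5/900. The probabilities differ between strata but are equal within each stratum, which is what makes the design probability-based.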
Probability-based samples are used to derive consistent estimates of population parameters, that
is, estimates that equal the population parameters when the entire population is included in the
sample. Consistent estimators commonly used in LC mapping from remotely sensed data include the
proportion of pixels correctly classified (global accuracy, also called overall accuracy); “user's
accuracy,” which is the probability that a pixel assigned to a particular cover class on the map
truly belongs to that class; and “producer's accuracy,” which is the probability that a pixel truly
belonging to a class was mapped as a member of that class. These estimators are typically derived
from a confusion matrix, which tabulates true class labels against those assigned on the map
according to the sample design.
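As a concrete illustration of these three measures, the sketch below computes them from a small confusion matrix. It assumes a simple random sample, so that raw sample proportions can be used directly, and it assumes that rows hold map labels and columns hold reference ("true") labels. The counts and the function name are invented for the example.

```python
def accuracy_from_confusion(matrix, classes):
    """Overall, user's, and producer's accuracy from a confusion matrix.

    matrix[i][j] is the number of sampled pixels mapped as classes[i]
    whose reference label is classes[j].
    """
    n_classes = len(classes)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n_classes))
    overall = correct / total

    users, producers = {}, {}
    for i, cover in enumerate(classes):
        row_total = sum(matrix[i])                               # pixels mapped as this class
        col_total = sum(matrix[r][i] for r in range(n_classes))  # pixels truly of this class
        users[cover] = matrix[i][i] / row_total if row_total else float("nan")
        producers[cover] = matrix[i][i] / col_total if col_total else float("nan")
    return overall, users, producers


# Hypothetical sample of 100 pixels (illustrative counts only).
classes = ["forest", "grassland", "water"]
matrix = [
    [45, 4, 1],   # mapped as forest
    [6, 30, 4],   # mapped as grassland
    [0, 2, 8],    # mapped as water
]

overall, users, producers = accuracy_from_confusion(matrix, classes)
```

With these counts, 83 of 100 sampled pixels lie on the diagonal, so the overall accuracy is 0.83. The user's accuracy for forest is 45/50, while its producer's accuracy is 45/51, because 51 sampled pixels are forest in the reference data.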
While design-based inference allows proper calculation of these very useful consistent
estimators, it is not without its difficulties. Foremost is the difficulty of verifying the accuracy
of the label assigned to a sampled pixel. In the case of a global map, it is generally not feasible
to visit a randomly assigned location on the Earth's surface. Thus, the accuracy of a label is
typically assessed using
finer-resolution remotely sensed data. In this case, accuracy is assessed by photointerpretation,
which is subject to its own error. Registration errors also occur and commonly restrict or negate a
pixel-based assessment strategy.
Another practical problem may lie in the classification scheme itself. Sometimes the LC types
are not mutually exclusive or are difficult to resolve. For example, in the International
Geosphere-Biosphere Programme (IGBP) legend, permanent wetland may also be forest (Loveland et al.,
1999). Or, the pixel may fall on a golf course. Is it grassland, savanna, agriculture, urban, or built-
up land? A related problem is that of mixed pixels. Where fine-resolution data show a selected
pixel to contain more than one cover class, how is a correct label to be assigned?
Additionally, the classification error structure as assessed by the consistent estimators above
may not be the most useful measure of classification accuracy. Some errors are clearly more
problematic than others. For example, for many applications, confusing forest with water is
probably a more serious error than confusing open with closed shrubland. This problem has led to
the development of “fuzzy” accuracies that better meet users' needs (Gopal and Woodcock, 1994).
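The idea that some confusions matter less than others can be illustrated with a simple severity-weighted agreement score. This is not the fuzzy-set method of Gopal and Woodcock (1994); it is a simpler stand-in, and the partial-credit value, the class pairs, and the function name are assumptions chosen only for illustration.

```python
def weighted_agreement(pairs, partial_credit):
    """Mean agreement where selected confusions earn partial credit.

    pairs is a list of (map_label, reference_label) tuples.
    partial_credit maps a frozenset of two class names to a credit in [0, 1];
    any confusion not listed earns no credit.
    """
    score = 0.0
    for mapped, reference in pairs:
        if mapped == reference:
            score += 1.0
        else:
            score += partial_credit.get(frozenset((mapped, reference)), 0.0)
    return score / len(pairs)


# Hypothetical sample of four pixels.
pairs = [
    ("forest", "forest"),
    ("open shrubland", "closed shrubland"),  # minor confusion
    ("forest", "water"),                     # serious confusion
    ("water", "water"),
]

# Assumed weight: confusing the two shrubland classes earns half credit.
credit = {frozenset(("open shrubland", "closed shrubland")): 0.5}

strict = weighted_agreement(pairs, {})
lenient = weighted_agreement(pairs, credit)
```

Here the strict score is 0.5 and the weighted score is 0.625. Only the shrubland confusion is forgiven in part, while the forest and water confusion still counts as a full error.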
A final concern is that a design-based sample designed to validate a specific map cannot
necessarily be used to validate another. A proper design-based validation procedure normally calls
for stratified sampling so that accuracies may be established for each class with equal certainty.
With stratified sampling, the probability of selection of all pixels within the same class is equal.
If a stratified sample is overlain on another map, the selected pixels do not retain this property,
thus introducing bias. Whereas an unstratified (random or regular) sample does not suffer from this
problem, very large sample sizes are typically required to gain sufficient samples from small classes
to establish their accuracies with needed precision.
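Because an equal allocation per class over-represents small classes, an overall accuracy estimate from a stratified sample has to weight each stratum by its share of the map. The sketch below shows the standard stratified estimator next to the naive pooled proportion. The stratum sizes, sample counts, and function name are hypothetical.

```python
def stratified_overall_accuracy(stratum_sizes, sample_counts, correct_counts):
    """Overall accuracy as the sum over strata of W_h * p_h.

    W_h is the stratum's share of all map pixels and p_h is the proportion
    of sampled pixels in that stratum whose map label is correct.
    """
    total_pixels = sum(stratum_sizes.values())
    estimate = 0.0
    for cover, size in stratum_sizes.items():
        weight = size / total_pixels
        estimate += weight * correct_counts[cover] / sample_counts[cover]
    return estimate


# Hypothetical map of 1000 pixels, sampled with 10 pixels per class.
stratum_sizes = {"forest": 900, "grassland": 90, "water": 10}
sample_counts = {"forest": 10, "grassland": 10, "water": 10}
correct_counts = {"forest": 9, "grassland": 7, "water": 5}

weighted = stratified_overall_accuracy(stratum_sizes, sample_counts, correct_counts)

# Naive pooling ignores the unequal selection probabilities between strata.
naive = sum(correct_counts.values()) / sum(sample_counts.values())
```

The weighted estimate is about 0.878, while the naive pooled proportion is 0.70. The weights are valid only for the map whose classes defined the strata. On a different map the same pixels no longer have equal selection probabilities within each class, which is the source of the bias noted above.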
The foregoing discussion described three major elements for validating LC maps, particularly
at the global scale, and it is clear that a proper validation plan requires all three.
Confidence-building measures are used at early stages both to refine a map that is under
construction and to characterize the general nature of errors of a specific map product.
Model-based inference, implemented during the classification process, can provide users with a
quantitative assessment of each classification decision. Design-based inference, although costly,
provides unbiased map accuracy statements using consistent estimators.