yet it is critical to maintain a large enough sample size so that any analysis performed is statistically

valid. Many researchers, notably Hord and Brooner (1976), van Genderen and Lock (1977), Tortora

(1978), Hay (1979), Rosenfield et al. (1982), and Congalton (1988a), have published equations and

guidelines for choosing the appropriate sample size. The majority of researchers have used an

equation based on the binomial distribution or the normal approximation to the binomial distribution

to compute the required sample size. These techniques are statistically sound for computing the

sample size needed to estimate the overall accuracy of a classification or the accuracy of

a single category. The equations are based on the proportion of correctly classified samples (pixels,

clusters, or polygons) and on some allowable error. However, these techniques were not designed

to choose a sample size for creating an error matrix. In the case of an error matrix, it is not simply

a matter of correct or incorrect. Given an error matrix with n land-cover categories, for a given

category there is one correct answer and n - 1 incorrect answers. Sufficient samples must be

acquired to be able to adequately represent this confusion. Therefore, the use of these techniques

for determining the sample size for an error matrix is not appropriate. Instead, the use of the

multinomial distribution is recommended (Tortora, 1978).

Traditional thinking about sampling does not often apply because of the large number of pixels

in a remotely sensed image. For example, a 0.5% sample of a single Landsat Thematic Mapper

(TM) scene can be over 300,000 pixels. Most, if not all, assessments should not be performed on

a per-pixel basis because of problems with exact single pixel location. Practical considerations

more often dictate the sample size selection. A balance between what is statistically sound and

what is practically attainable must be found. A generally accepted rule of thumb is to use a

minimum of 50 samples for each LC category in the error matrix. This rule also tends to agree

with the results of computing sample size using the multinomial distribution (Tortora, 1978). If

the area is especially large or the classification has a large number of LC categories (i.e., more

than 12 categories), the minimum number of samples should be increased to 75 to 100 samples

per category.
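
The difference between the two approaches to sample size can be illustrated with a short sketch. It assumes the commonly cited forms of the equations, which are not given in the text: n = z^2 p(1 - p) / E^2 for the normal approximation to the binomial, and n = B P(1 - P) / b^2 for the multinomial approach attributed to Tortora (1978), where B is the upper (alpha / k) percentile of the chi-square distribution with 1 degree of freedom and k is the number of categories. The function names and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def binomial_sample_size(p, allowable_error, confidence=0.95):
    """Normal approximation to the binomial: n = z^2 * p * (1 - p) / E^2.

    p is the expected proportion correct; allowable_error is E (as a fraction).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / allowable_error ** 2)


def multinomial_sample_size(k, proportion, precision, confidence=0.95):
    """Assumed multinomial form: n = B * P * (1 - P) / b^2.

    k is the number of map categories, proportion is the share of the map
    in the category of interest (0.5 is the worst case), precision is b.
    """
    alpha = 1 - confidence
    # A chi-square variable with 1 degree of freedom is a squared standard
    # normal, so its upper (alpha / k) percentile is z^2 at 1 - alpha / (2k).
    z = NormalDist().inv_cdf(1 - alpha / (2 * k))
    big_b = z ** 2
    return math.ceil(big_b * proportion * (1 - proportion) / precision ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: 85% expected accuracy, 5% allowable error
    print("binomial:", binomial_sample_size(0.85, 0.05))
    # Hypothetical inputs: 12 categories, worst-case proportion, 5% precision
    total = multinomial_sample_size(12, 0.5, 0.05)
    print("multinomial total:", total, "per category:", math.ceil(total / 12))
```

Under these assumptions the multinomial total grows with the number of categories, which is in line with the advice above to raise the per-category minimum when the classification has many categories.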

The number of samples for each category can also be weighted based on the relative importance

of that category within the objectives of the mapping or on the inherent variability within each of

the categories. Sometimes it is better to concentrate the sampling on the categories of interest and

increase their number of samples while reducing the number of samples taken in the less important

categories. Also, it may be useful to take fewer samples in categories that show little variability,

such as water or forest plantations, and increase the sampling in the categories that are more

variable, such as uneven-aged forests or riparian areas. In summary, the goal is to balance the

statistical recommendations to obtain an adequate sample from which to generate an appropriate

error matrix within the objectives, time, cost, and practical limitations of the mapping project.
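
One way to sketch the weighting described above is to give every category a minimum number of samples and then distribute the remaining samples in proportion to a weight that reflects importance or variability. The text prescribes no particular allocation formula, so the function `allocate_samples`, the weights, and the category names below are all hypothetical.

```python
def allocate_samples(weights, total, minimum=50):
    """Give each category `minimum` samples, then split the rest by weight.

    weights maps category name -> positive weight (importance or variability).
    """
    if total < minimum * len(weights):
        raise ValueError("total is too small to meet the per-category minimum")
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")

    remainder = total - minimum * len(weights)
    allocation = {
        name: minimum + int(remainder * w / weight_sum)
        for name, w in weights.items()
    }
    # Hand out the few samples lost to rounding down, largest weight first
    leftover = total - sum(allocation.values())
    for name in sorted(weights, key=weights.get, reverse=True)[:leftover]:
        allocation[name] += 1
    return allocation


if __name__ == "__main__":
    # Hypothetical weights: low for uniform categories, high for variable ones
    weights = {
        "water": 1,
        "forest_plantation": 1,
        "uneven_aged_forest": 3,
        "riparian": 3,
    }
    for name, count in allocate_samples(weights, total=400).items():
        print(name, count)
```

The minimum keeps the low-variability categories from dropping below the rule of thumb, while the weights shift the remaining effort toward the categories that matter most or vary most.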

Along with sample size, sampling scheme is an important part of any accuracy assessment.

Selection of the proper scheme is absolutely critical to generating an error matrix that is representative

of the entire classified image. A poor choice of sampling scheme can result in significant biases

being introduced into the error matrix that may over- or underestimate the true accuracy. In addition,

the use of the proper sampling scheme may be essential depending on the analysis techniques to

be applied to the error matrix.

Many researchers have expressed opinions about the proper sampling scheme to use, including

everything from simple random sampling to stratified, systematic, unaligned sampling. Despite all

these opinions, very little work has actually been performed in this area. Congalton (1988a)

performed sampling simulations on three spatially diverse areas (forest, agriculture, and rangeland)

and concluded that in all cases simple random sampling without replacement and stratified random

sampling provided satisfactory results. Despite the desirable statistical properties of simple random

sampling, this sampling scheme is not always very practical to apply. Simple random sampling

tends to undersample small but possibly very important areas unless the sample size is significantly

increased. For this reason, stratified random sampling is recommended, with a minimum number

of samples selected from each stratum (i.e., category). Even stratified random sampling can be
