Geoscience Reference
In-Depth Information
are found in only a few first-stage PSUs, many of the 100 second-stage sample pixels would fall
within these same few PSUs. This clustering could result in poor precision for the estimated accuracy
of this class. Ameliorating this concern is the fact that the NLCD clustering is at the regional level
of control. The PSUs were large (e.g., 6 × 6 km), so pixels sampled within the same PSU will not
necessarily exhibit strong intracluster correlation. In the case of weak intracluster correlation of
classification error, cluster sampling will not result in precision significantly different from a simple
random sample of the same size (Cochran, 1977).
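The link between intracluster correlation and precision can be illustrated with the standard design-effect approximation for equal-sized clusters, deff ≈ 1 + (m − 1)ρ, where m is the number of sample pixels per PSU and ρ is the intracluster correlation of classification error. The Python sketch below is illustrative only: the function names, the figure of 20 pixels per PSU, and the ρ values are assumptions and are not taken from the NLCD assessment.

```python
def design_effect(pixels_per_psu: float, rho: float) -> float:
    """Approximate variance inflation of a cluster sample relative to a
    simple random sample of the same size (equal-sized clusters assumed)."""
    return 1.0 + (pixels_per_psu - 1.0) * rho


def effective_sample_size(n: int, pixels_per_psu: float, rho: float) -> float:
    """Size of the simple random sample that would give similar precision."""
    return n / design_effect(pixels_per_psu, rho)


# Hypothetical case: 100 rare-class pixels falling in 5 PSUs (20 per PSU).
for rho in (0.01, 0.3):
    n_eff = effective_sample_size(100, 20, rho)
    print(f"rho={rho}: deff={design_effect(20, rho):.2f}, n_eff={n_eff:.1f}")
```

Under these assumed values, a weak correlation (ρ = 0.01) leaves the effective sample size at about 84, whereas a strong correlation (ρ = 0.3) reduces it to about 15.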
Two alternatives may counter the clustering effect for rare-class pixels. One is to select a single
pixel at random from 100 first-stage PSUs containing at least one pixel of the rare class. If the class
is present in more than 100 PSUs, the first-stage PSUs could be subsampled to reduce the eligible
set to 100. If fewer than 100 PSUs contain the rare class, the more likely scenario, the situation is
slightly more complicated. A fixed number of pixels may be sampled from each first-stage PSU
containing the rare class so that the total sample size for the rare class is maintained at 100. The
complication is choosing the sample size for each PSU. This will depend on the number of eligible
first-stage PSUs, and also on the number of pixels of the class in the PSU. This design option
counters the potential clustering effect of rare-class pixels by forcing the second-stage sample to be
widely dispersed among the eligible first-stage PSUs. In contrast to the outcome of the NLCD, PSUs
containing a large proportion of the rare class will not receive the majority of the second-stage sample.
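One way the per-PSU sample sizes might be chosen when fewer than 100 PSUs contain the rare class is sketched below in Python. The allocation rule (an equal share per eligible PSU, capped by the number of rare-class pixels available, with any shortfall redistributed) is an assumption made for illustration, as are the function name and the pixel counts; the text does not prescribe a specific rule.

```python
def allocate_sample(rare_pixel_counts: dict, total: int = 100) -> dict:
    """Spread `total` sample pixels as evenly as possible over eligible PSUs.

    rare_pixel_counts maps a PSU label to its number of rare-class pixels.
    No PSU is allocated more pixels than it contains.
    """
    if sum(rare_pixel_counts.values()) < total:
        raise ValueError("fewer rare-class pixels than the requested sample size")

    allocation = {psu: 0 for psu in rare_pixel_counts}
    remaining = total
    while remaining > 0:
        # PSUs that still have unsampled rare-class pixels.
        open_psus = [p for p, c in rare_pixel_counts.items() if allocation[p] < c]
        share = max(1, remaining // len(open_psus))
        for psu in open_psus:
            if remaining == 0:
                break
            take = min(share, rare_pixel_counts[psu] - allocation[psu], remaining)
            allocation[psu] += take
            remaining -= take
    return allocation


# Hypothetical region with only four eligible PSUs.
counts = {"A": 1000, "B": 20, "C": 5, "D": 300}
print(allocate_sample(counts))
```

In this hypothetical case the small PSUs (B and C) are sampled completely and the balance is split between A and D, so the PSU with the most rare-class pixels does not dominate the sample.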
The second option to counter clustering of the sample into a few PSUs is to construct a “self-
weighting” design (i.e., an equal probability sampling design in which all pixels have the same
probability of being included in the sample). The term “self-weighting” arises from the fact that the
analysis requires no weighting to account for different inclusion probabilities. At the first stage,
100 sample PSUs would be selected with inclusion probability proportional to the number of pixels
of the specified rare class in the PSU. A wide variety of probability proportional to size designs
exists, but simplicity would be the primary consideration when selecting the design for an accuracy
assessment application. At the second stage, one pixel would be selected per PSU. A consequence
of this two-stage protocol is that within each LC stratum, each pixel has an equal probability of
being included in the sample (Särndal et al., 1992), so no individual pixel weighting is needed for
the user accuracy estimates. The design goal of distributing the sample pixels among 100 PSUs is
also achieved.
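The equal-probability property of this two-stage protocol can be checked with a short calculation. The Python sketch below assumes a first-stage design in which a PSU's inclusion probability is exactly n × M_i / M (n sample PSUs, M_i rare-class pixels in PSU i, M rare-class pixels in the region) and that no PSU is so large that this value exceeds 1. The function name and the pixel counts are hypothetical.

```python
from fractions import Fraction


def pixel_inclusion_probabilities(rare_pixel_counts: dict, n_psus: int) -> dict:
    """Overall inclusion probability of a rare-class pixel in each PSU."""
    total = sum(rare_pixel_counts.values())
    probs = {}
    for psu, count in rare_pixel_counts.items():
        if count == 0:
            continue  # PSU holds no rare-class pixels, so it is not eligible
        first_stage = Fraction(n_psus * count, total)  # proportional to size
        if first_stage > 1:
            raise ValueError(f"PSU {psu} is too large for exact PPS selection")
        second_stage = Fraction(1, count)  # one pixel chosen at random
        probs[psu] = first_stage * second_stage
    return probs


# Hypothetical counts; two PSUs are selected at the first stage.
counts = {"A": 1000, "B": 20, "C": 480, "D": 500}
print(pixel_inclusion_probabilities(counts, n_psus=2))
```

The PSU size cancels in the product, so every rare-class pixel has probability n / M regardless of which PSU contains it.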
2.2.3 Comparison of the Three Options
Three criteria will be used to compare the NLCD design alternatives: (1) ease of implementation,
(2) simplicity of analysis, and (3) precision. The actual NLCD design will be designated as “Option
1,” sampling one pixel from each of 100 PSUs will be “Option 2,” and the self-weighting design
will be referred to as “Option 3.” Options 1 and 2 are the easiest to implement, and Option 3 is
the most complicated because of the potentially complex, unequal probability first-stage protocol.
Not only would such a first-stage design be more complex than what is typically done in accuracy
assessment, but Option 3 also requires much more effort because the number of pixels of each LC
class within each PSU in the region must be known.
Options 1 and 3 share the characteristic of being self-weighting within LC strata. Self-weighting
designs are simpler to analyze, although survey sampling computational software would mitigate
this analysis advantage. Option 2 is not self-weighting, as demonstrated by the following example.
Suppose a first-stage PSU has 1,000 pixels of the rare class and another PSU has 20 pixels of this
class. At the first stage under Option 2, both PSUs have an equal chance of being selected. At the
second stage, a pixel in the first PSU has a probability of 1/1000 of being chosen, whereas a pixel
in the second PSU has a 1/20 chance of being sampled. Clearly, the probability of a pixel's being
included in the sample is dependent upon how many other pixels of that class are found within the
PSU. The appropriate estimation weights can be derived for this unequal probability design, but
the analysis is complicated.
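The unequal inclusion probabilities in this example, and the weights they imply, can be worked through in Python. The sketch below takes the estimation weight to be the inverse of the pixel's inclusion probability and uses a weighted ratio estimator for user accuracy; both are common survey-sampling choices assumed here for illustration, not the specific estimator derived for this design. The first-stage probability of 1/2 and the function names are hypothetical.

```python
from fractions import Fraction


def option2_weight(first_stage_prob: Fraction, rare_pixels_in_psu: int) -> Fraction:
    """Inverse inclusion probability for a pixel when one pixel is drawn per PSU."""
    inclusion = first_stage_prob * Fraction(1, rare_pixels_in_psu)
    return 1 / inclusion


def weighted_user_accuracy(samples: list) -> Fraction:
    """samples: (weight, correctly_classified) pairs for one map class."""
    total_weight = sum(w for w, _ in samples)
    return sum(w for w, correct in samples if correct) / total_weight


p = Fraction(1, 2)  # assumed equal first-stage probability for both PSUs
w_large = option2_weight(p, 1000)  # pixel from the PSU with 1,000 rare-class pixels
w_small = option2_weight(p, 20)    # pixel from the PSU with 20 rare-class pixels
print(w_large, w_small, w_large / w_small)

# Hypothetical outcome: the first pixel is correct, the second is not.
print(weighted_user_accuracy([(w_large, True), (w_small, False)]))
```

The sample pixel from the 1,000-pixel PSU carries 50 times the weight of the pixel from the 20-pixel PSU, which is why an unweighted proportion would be inappropriate under Option 2.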