In practice, both regional and local control may be employed in the same design. The most
likely combination in such a multistage design would be to exercise regional control via two-stage
cluster sampling and local control via one-stage cluster sampling, as follows. Define the primary
sampling unit as the cluster constructed to obtain regional spatial control (e.g., a 6 × 6-km area).
The secondary sampling unit would be chosen to provide the desired local spatial control (e.g., a 3 × 3
block of pixels). The first-stage sample consists of primary sampling units (PSUs), but not
every 3 × 3 block in each sampled PSU is observed. Rather, a second-stage sample of 3 × 3 blocks
would be selected from those available in the first-stage sample. The 3 × 3 blocks would not be
further subsampled; instead, reference data would be obtained for all nine pixels of the 3 × 3 cluster.
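The combined design described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any particular assessment: the region is a rectangular pixel grid, PSUs are square groups of pixels that tile the grid exactly, each PSU is tiled exactly by 3 × 3 blocks, and both stages use simple random sampling. The function name and all parameter names are hypothetical.

```python
import random


def two_stage_block_sample(n_rows, n_cols, psu_size, n_psus, blocks_per_psu,
                           block_size=3, seed=0):
    """Select PSUs, then 3 x 3 blocks within each sampled PSU.

    Assumes n_rows and n_cols are multiples of psu_size, and psu_size is a
    multiple of block_size, so PSUs and blocks partition the region.
    """
    rng = random.Random(seed)
    # Upper-left corners of the PSUs that partition the region.
    psus = [(r, c) for r in range(0, n_rows, psu_size)
            for c in range(0, n_cols, psu_size)]
    sample = []
    # First stage: simple random sample of PSUs (regional control).
    for pr, pc in rng.sample(psus, n_psus):
        # Upper-left corners of the blocks inside this PSU.
        blocks = [(r, c) for r in range(pr, pr + psu_size, block_size)
                  for c in range(pc, pc + psu_size, block_size)]
        # Second stage: simple random sample of blocks (local control).
        for br, bc in rng.sample(blocks, blocks_per_psu):
            # No further subsampling: every pixel of the block is observed.
            pixels = [(br + i, bc + j)
                      for i in range(block_size) for j in range(block_size)]
            sample.append({"psu": (pr, pc), "block": (br, bc), "pixels": pixels})
    return sample


sample = two_stage_block_sample(n_rows=60, n_cols=60, psu_size=12,
                                n_psus=5, blocks_per_psu=4)
print(len(sample), "blocks,", sum(len(s["pixels"]) for s in sample), "reference pixels")
```

Each returned record keeps its PSU and block identifiers because variance estimation for a cluster design needs to know which pixels were selected together.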
Stratifying by LC class can directly conflict with clustering. The essence of the problem is
illustrated by a simple example. Suppose the clusters are 3 × 3 blocks of pixels that, when taken
together, partition the mapped region. The majority of these clusters will not consist of nine pixels
all belonging to the same LC class. Stratified sampling directs us to select individual pixels from
each LC class, in opposition to cluster sampling in which the selection protocol is based on a group
of pixels. Because cluster sampling selects groups of pixels, we forfeit the control over the sample
allocation that is sought by stratified sampling. It is possible to sample clusters via a stratified
design, but it is the cluster, not the individual pixel, that must determine stratum membership.
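To make the last point concrete, the sketch below assigns each cluster to a stratum using a cluster-level rule and then selects clusters within each stratum. The majority-class rule used here is an assumption chosen for illustration; the text does not prescribe how a cluster's stratum should be determined. The function names and the toy cluster labels are likewise hypothetical.

```python
import random
from collections import Counter, defaultdict


def cluster_stratum(labels):
    """Return the most common mapped class among a cluster's pixels."""
    counts = Counter(labels)
    # Ties go to the smallest label so the assignment is deterministic.
    return max(sorted(counts), key=counts.get)


def stratified_cluster_sample(clusters, n_per_stratum, seed=0):
    """Stratify whole clusters, then draw a simple random sample per stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for cluster_id, labels in clusters.items():
        # Stratum membership is a property of the cluster, not of a pixel.
        strata[cluster_stratum(labels)].append(cluster_id)
    return {stratum: rng.sample(sorted(ids), min(n_per_stratum, len(ids)))
            for stratum, ids in sorted(strata.items())}


# Toy 3 x 3 clusters, each given as the mapped classes of its nine pixels.
clusters = {
    "A": ["forest"] * 9,
    "B": ["forest"] * 5 + ["water"] * 4,
    "C": ["water"] * 7 + ["urban"] * 2,
    "D": ["urban"] * 6 + ["forest"] * 3,
}
selected = stratified_cluster_sample(clusters, n_per_stratum=1)
print(selected)
```

Note that selecting cluster B from the forest stratum also brings four water pixels into the sample, which is exactly the loss of control over pixel-level allocation described above.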
A variety of approaches to circumvent this conflict between stratified and cluster sampling can
be posed. One that should not be considered is to restrict the sample to only homogeneous 3 × 3
clusters. This approach clearly results in a sample that cannot be considered representative of the
population, and it is well known that sampling only homogeneous areas of the map tends to inflate
accuracy (Hammond and Verbyla, 1996). A second approach, and one that maintains the desired
statistical rigor of the sampling protocol, is to employ two-stage cluster sampling in conjunction
with stratification by LC class. A third approach in which the clusters are redefined to permit
stratified selection will also be described.
The sampling design implemented in the accuracy assessment of the National Land Cover Data
(NLCD) map illustrates how cluster sampling and stratification can be combined to achieve cost-
effectiveness and precise class-specific estimates (Zhu et al., 2000; Yang et al., 2001; Stehman et al.,
2003). The NLCD design was implemented across the U.S. using 10 regional assessments based on
the U.S. Environmental Protection Agency's (EPA) federal administrative regions. Within a single
region, the NLCD assessment was designed to provide regional spatial control and stratification by
LC class. For several regions, the PSU was constructed from nonoverlapping, equal-sized areas of
National Aerial Photography Program (NAPP) photo-frames, and in other regions, the PSU was a 6 × 6-km
spatial unit. Both PSU constructions were designed to reduce the number of photos that would
need to be purchased for reference data collection. A first-stage sample of PSUs was selected at a
sampling rate of approximately 2.0%. Stratification by LC class was implemented at the second stage
of the design. Mapped LC classes were used to stratify all pixels found within the first-stage sample
PSUs. A simple random sample of pixels from each stratum was then selected, typically with 100
pixels per class. This design proved effective for ensuring that all LC classes, including the rare classes,
were represented adequately so that estimates of user's accuracies were reasonably precise. The
clustering feature implemented to achieve regional control succeeded at reducing costs considerably.
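The two stages of this design can be sketched as follows. This is a simplified illustration, not the NLCD implementation: it assumes the first-stage PSUs are drawn by simple random sampling (the text gives only the sampling rate), represents the map as a dictionary from PSU identifier to a list of (pixel identifier, mapped class) pairs, and uses hypothetical function and parameter names. The 2% rate and 100 pixels per class follow the figures quoted above.

```python
import random
from collections import defaultdict


def nlcd_style_sample(psu_pixels, psu_rate=0.02, n_per_class=100, seed=0):
    """Two-stage sample: PSUs first, then pixels stratified by mapped class.

    psu_pixels maps each PSU id to a list of (pixel_id, mapped_class) pairs.
    """
    rng = random.Random(seed)
    psu_ids = sorted(psu_pixels)
    # First stage: sample PSUs at the stated rate (at least one PSU).
    n_first = max(1, round(psu_rate * len(psu_ids)))
    first_stage = rng.sample(psu_ids, n_first)

    # Second stage: pool the pixels of the sampled PSUs into class strata.
    strata = defaultdict(list)
    for psu in first_stage:
        for pixel_id, mapped_class in psu_pixels[psu]:
            strata[mapped_class].append((psu, pixel_id))

    # Simple random sample of pixels within each mapped-class stratum.
    second_stage = {
        mapped_class: rng.sample(members, min(n_per_class, len(members)))
        for mapped_class, members in sorted(strata.items())
    }
    return first_stage, second_stage


# Toy map: 500 PSUs of 50 pixels each, one pixel in ten mapped as "wetland".
psu_pixels = {
    psu: [(i, "wetland" if i % 10 == 0 else "forest") for i in range(50)]
    for psu in range(500)
}
first_stage, second_stage = nlcd_style_sample(psu_pixels)
print(len(first_stage), {k: len(v) for k, v in second_stage.items()})
```

Because reference pixels are drawn only from first-stage PSUs, photo acquisition is confined to those PSUs, while the per-class sample size keeps rare classes represented whenever enough of their pixels fall inside the sampled PSUs.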
2.2.2 Flexibility of the NLCD Design
The flexibility of the NLCD design permits other options for selecting a second-stage sample.
An alternative second-stage design could improve precision of the NLCD estimates (Stehman et
al., 2000b), but such improvements are not guaranteed and would be gained at some cost. Precision
for the rare LC classes is the primary consideration. Often the rare-class pixels cluster within a
relatively small number of PSUs. The simple random selection within each class implemented in
the second stage of the NLCD design will result in a sample with representation proportional to
the number of pixels of each class within each PSU. That is, if many of the pixels of a rare class