Geoscience Reference
In-Depth Information
If the sample sizes from the different strata are taken in proportion to the
strata sizes, then this is called stratified sampling with proportional alloca-
tion. The samples are self-weighting in the sense that the estimates of the
overall mean and the overall proportion are the same as what is obtained by
lumping the results from all the strata together as a single sample. However,
the simple random sampling variance formulas are not correct, and the strat-
ified random sampling variance formulas should be used instead.
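The self-weighting property can be checked numerically. The following is a minimal sketch in Python; the function names and the two-stratum data are hypothetical and chosen only for illustration. It allocates a sample in proportion to stratum sizes and compares the stratum-weighted mean with the mean of all observations lumped together.

```python
def proportional_allocation(stratum_sizes, total_sample_size):
    """Return unrounded per-stratum sample sizes n_i = n * N_i / N."""
    population_size = sum(stratum_sizes)
    return [total_sample_size * size / population_size for size in stratum_sizes]


def stratified_mean(stratum_sizes, stratum_means):
    """Weight each stratum sample mean by its population share N_i / N."""
    population_size = sum(stratum_sizes)
    return sum(size / population_size * mean
               for size, mean in zip(stratum_sizes, stratum_means))


def pooled_mean(samples):
    """Mean of all observations lumped together as a single sample."""
    values = [y for sample in samples for y in sample]
    return sum(values) / len(values)


# Hypothetical data: two strata of sizes 600 and 400, total sample of 10.
stratum_sizes = [600, 400]
sample_sizes = proportional_allocation(stratum_sizes, 10)  # 6 and 4 units
samples = [[2, 3, 1, 4, 2, 3], [7, 9, 8, 6]]               # made-up observations
stratum_means = [sum(s) / len(s) for s in samples]

weighted = stratified_mean(stratum_sizes, stratum_means)
pooled = pooled_mean(samples)
print(weighted, pooled)
```

The two means agree only because the sample sizes match the proportional allocation. The agreement does not extend to the variance, which still has to come from the stratified formulas.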
Although proportional allocation is often used because it is convenient,
it is not necessarily the most efficient use of resources. One result applies
if the total cost of a survey consists of a fixed cost $F$ plus costs that are
proportional to the sample sizes in the strata, so that
Total Cost = $F + \sum c_i n_i$, where $c_i$ is the cost of sampling one unit
from stratum $i$. Then, it can be shown that to either (1) achieve a given level
of precision for estimating the overall population mean at the least cost or
(2) gain the maximum precision for a fixed total cost, the sample size in the
$i$th stratum should be made proportional to $N_i \sigma_i / \sqrt{c_i}$, where
$N_i$ is the size of stratum $i$ and $\sigma_i$ is the standard deviation within
it. Use of this result requires approximate values for strata variances
and knowledge of sampling costs. If these variances and costs are the same
in all strata, then proportional allocation is optimal. For more details about
how to apply the result, and optimum stratified sampling in general, see the
work of Cochran (1977) or Scheaffer et al. (2011).
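As a rough illustration of this allocation rule, the sketch below splits a given total sample size in proportion to $N_i \sigma_i / \sqrt{c_i}$. It is a simplification: the total sample size is taken as given rather than derived from a cost or precision target, and the function name, standard deviations, and unit costs are hypothetical values used only for illustration.

```python
import math


def optimum_allocation(stratum_sizes, stratum_sds, unit_costs, total_sample_size):
    """Split total_sample_size in proportion to N_i * sigma_i / sqrt(c_i).

    Returns unrounded sample sizes; rounding to integers is left to the user.
    """
    weights = [size * sd / math.sqrt(cost)
               for size, sd, cost in zip(stratum_sizes, stratum_sds, unit_costs)]
    total_weight = sum(weights)
    return [total_sample_size * w / total_weight for w in weights]


# Hypothetical inputs: three strata, total sample of 50 units.
sizes = [20000, 15000, 15000]

# Equal standard deviations and equal costs: reduces to proportional allocation.
equal = optimum_allocation(sizes, [1.0, 1.0, 1.0], [4.0, 4.0, 4.0], 50)

# Stratum 1 more variable and cheaper to sample: it receives a larger share.
unequal = optimum_allocation(sizes, [2.0, 1.0, 1.0], [1.0, 4.0, 4.0], 50)
print(equal, unequal)
```

With equal standard deviations and costs the weights depend only on the stratum sizes, which is why proportional allocation is optimal in that case.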
EXAMPLE 2.6 Stratified Sampling of Strip Transects
Consider again the situation discussed concerning the winter mortality
of deer as estimated by counting dead deer in a sampled strip transect
1 km long and 60 m wide. Such a study might well include stratification
based on habitat, on the assumption that dead deer are more likely to be
found in some types of habitat than in others.
Suppose that this is the case and that a study area is divided into three
habitat types. In type I habitat, there are $N_1$ = 20,000 potential strip
transects, of which $n_1$ = 20 are randomly chosen for sampling. In type II
habitat, there are $N_2$ = 15,000 strip transects, of which $n_2$ = 15 are
randomly chosen for sampling. Finally, in type III habitat there are
$N_3$ = 15,000 strip transects, of which $n_3$ = 15 are randomly chosen for
sampling.
Suppose further that the number of dead deer found is as shown in
Table 2.2. Note also the values of strata sample sizes $n_i$, sample means
$\bar{y}_i$, sample standard deviations $s_i$, estimated variances of sample
means $\widehat{\mathrm{Var}}(\bar{y}_i)$, and stratum sizes $N_i$, which are
shown in the table.
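The arithmetic for this example can be sketched in Python, as a check rather than a definitive implementation. The stratum means and variances of means are the rounded values quoted in the text's calculation from Table 2.2, and the variable names are hypothetical. With these rounded inputs the variance comes out at about 0.0092 rather than the quoted 0.00934. That difference presumably comes from rounding in the tabulated values, which is an assumption here. The confidence limits still round to 0.35 and 0.73.

```python
import math

# Stratum sizes N_i and rounded summaries as quoted in the text (Table 2.2).
stratum_sizes = [20000, 15000, 15000]
sample_means = [0.90, 0.27, 0.33]      # mean dead deer per transect, by stratum
var_of_means = [0.036, 0.013, 0.025]   # estimated variances of the sample means

population_size = sum(stratum_sizes)
weights = [size / population_size for size in stratum_sizes]  # N_i / N

# Stratified mean: sum of (N_i / N) * ybar_i.
mean = sum(w * m for w, m in zip(weights, sample_means))

# Variance of the stratified mean: sum of (N_i / N)^2 * Var(ybar_i).
variance = sum(w ** 2 * v for w, v in zip(weights, var_of_means))

standard_error = math.sqrt(variance)

# Approximate 95% confidence interval using the normal multiplier 1.96.
ci = (mean - 1.96 * standard_error, mean + 1.96 * standard_error)

print(f"mean = {mean:.3f}, variance = {variance:.5f}, SE = {standard_error:.3f}")
print(f"95% CI = {ci[0]:.2f} to {ci[1]:.2f}")
```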
Using Equation (2.23) and the information in Table 2.2, the estimated
mean number of dead deer per transect for the whole population is
$\bar{y}_s = (20{,}000/50{,}000) \times 0.90 + (15{,}000/50{,}000) \times 0.27 + (15{,}000/50{,}000) \times 0.33 = 0.540$,
with estimated variance
$\widehat{\mathrm{Var}}(\bar{y}_s) = (20{,}000/50{,}000)^2 \times 0.036 + (15{,}000/50{,}000)^2 \times 0.013 + (15{,}000/50{,}000)^2 \times 0.025 = 0.00934$.
The estimated standard error is then
$\mathrm{SE}(\bar{y}_s) = \sqrt{\widehat{\mathrm{Var}}(\bar{y}_s)} = \sqrt{0.00934} = 0.097$,
and an approximate 95% confidence interval for the population mean number of
deer per transect is 0.540 ± 1.96 × 0.097 or 0.35 to 0.73. It follows that an
estimate of the total number of dead deer in the entire study region is