It is interesting to note that, to analyze the efficiency of Eq. (6.17) with respect to
SRS, we argue from the opposite position to that used for stratified sampling
in Eq. (6.13). In terms of variance decomposition, cluster sampling is affected by
the variance between groups and not by the variance within the groups. Its
efficiency therefore depends on which of these two components of the variance of the
variable of interest has the greater weight. In general, but not necessarily, the
groups are planned to be internally homogeneous; in that case the variance
between the groups is very high, and cluster sampling should be less efficient than
both the stratified design and SRS.
Its efficiency with respect to the SRS design is measured using (Särndal
et al. 1992, p. 132)

$$\mathrm{deff}\left(\mathrm{Clus}, \hat{t}_{HT}\right) = 1 + \left(\bar{N} - 1\right)\delta, \tag{6.18}$$
where $\bar{N}$ is the average size of the groups, and $\delta$ is a homogeneity coefficient (see
Särndal et al. 1992, p. 130). Whenever $\delta$ is positive, this design is less efficient
than SRS; only a negative $\delta$ (meaning there is a large within-group variation)
makes it more efficient.
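As a small numerical illustration of the relation deff = 1 + (average group size - 1) × δ, the following Python sketch evaluates the design effect for a few homogeneity levels. The function name `deff_cluster` and the input values are hypothetical, chosen only to show how the sign of δ moves the design effect above or below 1.

```python
def deff_cluster(avg_group_size, delta):
    """Design effect of cluster sampling relative to SRS: 1 + (N_bar - 1) * delta."""
    return 1.0 + (avg_group_size - 1.0) * delta


# Hypothetical inputs: groups of average size 10 and three homogeneity levels.
for delta in (-0.05, 0.0, 0.1):
    effect = deff_cluster(10, delta)
    verdict = "less efficient than SRS" if effect > 1 else "not less efficient than SRS"
    print(f"delta = {delta:+.2f}  deff = {effect:.2f}  ({verdict})")
```

Even a modest positive δ inflates the variance noticeably once the groups are large, because δ is multiplied by the average group size minus one.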
If we perform an SRS in both the selection stages, Eq. ( 6.14 ) reduces to
$$\hat{t}_{HT,2STSRS} = \sum_{i \in s_1} \sum_{k \in s_i} d_k y_k = \sum_{i \in s_1} \sum_{k \in s_i} \frac{N_1}{n_1} \frac{N_i}{n_i} y_k, \tag{6.19}$$
where $n_i$ is the sample size within group $i$, and $d_k = (N_1/n_1)(N_i/n_i)$ are the direct
sampling weights. Similarly, Eq. (6.15) reduces to
$$\hat{V}_{HT}\left(\hat{t}_{HT,2STSRS}\right) = N_1^2 \frac{1 - f_1}{n_1} S^2_{\hat{t}, s_1} + \frac{N_1}{n_1} \sum_{i \in s_1} N_i^2 \frac{1 - f_i}{n_i} S^2_{y,i}, \tag{6.20}$$
where $S^2_{y,i}$ is the variance of the variable $y$ within group $i$. Note that $S^2_{\hat{t}, s_1}$ is different
from $S^2_{t, s_1}$ because in the first case the cluster totals are estimated, while in the
second case they are known since each group is censused.
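The two-stage formulas can be checked by hand on a tiny example. The Python sketch below computes the estimated total and its estimated variance under SRS at both stages. It is only an illustration under stated assumptions: the function name `two_stage_srs`, the data layout, and the toy values are hypothetical, the sampling fractions are taken to be f_1 = n_1/N_1 and f_i = n_i/N_i, and the variances are the usual sample variances with an n - 1 denominator.

```python
from statistics import variance


def two_stage_srs(sample, n_groups_pop, group_sizes):
    """Estimated total and estimated variance for two-stage SRS.

    sample:       dict {group id: list of observed y values in that group}
    n_groups_pop: N_1, number of groups in the population
    group_sizes:  dict {group id: N_i, number of units in the group}
    Needs at least two sampled groups and two sampled units per group.
    """
    n1 = len(sample)
    f1 = n1 / n_groups_pop                      # first-stage sampling fraction (assumed n_1 / N_1)
    est_totals = []                             # estimated group totals
    within = 0.0                                # second-stage variance component
    for g, ys in sample.items():
        size, n_i = group_sizes[g], len(ys)
        f_i = n_i / size                        # second-stage sampling fraction (assumed n_i / N_i)
        est_totals.append(size / n_i * sum(ys))
        within += size ** 2 * (1 - f_i) / n_i * variance(ys)
    t_hat = n_groups_pop / n1 * sum(est_totals)
    v_hat = (n_groups_pop ** 2 * (1 - f1) / n1 * variance(est_totals)
             + n_groups_pop / n1 * within)
    return t_hat, v_hat


# Hypothetical toy data: 2 of N_1 = 4 groups sampled, a few units observed in each.
toy_sample = {"a": [1, 3], "b": [2, 4, 6]}
toy_sizes = {"a": 4, "b": 6}
t_hat, v_hat = two_stage_srs(toy_sample, n_groups_pop=4, group_sizes=toy_sizes)
print(t_hat, v_hat)
```

The first term of the variance uses the variability of the estimated group totals, and the second adds the uncertainty from subsampling within each selected group; if every selected group were censused, each f_i would equal 1 and the second term would vanish.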
The mstage function in the sampling package selects multistage random
samples. In the following example, we have used it to select (without replacement)
$n_1 = 10$ of the $N_1 = 100$ groups obtained by overlaying a 10 × 10 grid over the study
region. As with the stratified design, the outcome of this function should be managed
using the getdata utility function, which extracts the sample data from the
population frame. The HT estimates are obtained using the standard sequence;
the id = ~strataid2+id option is used in the svydesign function to identify
the codes defining the sampling stages, and the fpc = ~prob1+prob2 option now
requires two probability vectors, one for each random selection stage. Note that
before executing this code, one must run all the previous code of this chapter,
because framepop needs all of the previously generated variables. The selected
sample is mapped in Fig. 6.6. Note that the deff shows that the two-stage plan can be
inferior to SRS.
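To make the selection step concrete without depending on the R workflow, here is a minimal, language-neutral sketch in Python of a two-stage draw: 10 of 100 grid cells at the first stage, then a simple random subsample of units inside each selected cell. It is not the mstage function and makes no claim about how mstage or getdata name their output. The frame is hypothetical (10 units per cell), the second-stage size of 5 is an assumption since the text does not state it, and the keys `prob1` and `prob2` merely echo the names used in the fpc option above.

```python
import random


def select_two_stage(groups, n1, n2, seed=0):
    """Draw n1 groups, then n2 units per selected group, both by SRS without replacement.

    groups: dict {group id: list of unit ids}
    Returns one record per sampled unit with its stage-wise inclusion
    probabilities and its direct weight (N_1 / n_1) * (N_i / n_i).
    """
    rng = random.Random(seed)
    n_groups_pop = len(groups)
    records = []
    for g in rng.sample(sorted(groups), n1):        # first stage: groups
        units = groups[g]
        for u in rng.sample(units, n2):             # second stage: units within the group
            records.append({
                "group": g,
                "unit": u,
                "prob1": n1 / n_groups_pop,
                "prob2": n2 / len(units),
                "weight": (n_groups_pop / n1) * (len(units) / n2),
            })
    return records


# Hypothetical frame: a 10 x 10 grid of cells, each holding 10 units.
frame = {(r, c): [(r, c, j) for j in range(10)] for r in range(10) for c in range(10)}
sample = select_two_stage(frame, n1=10, n2=5)
print(len(sample), sum(rec["weight"] for rec in sample))
```

With equal cell sizes and equal subsample sizes, the weights sum to the population size, which is a quick sanity check on any two-stage selection before computing estimates.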