It is interesting to note that, to analyze the efficiency of Eq. (6.17) with respect to SRS, we argue from the opposite position to that used for stratified sampling in Eq. (6.13). In terms of variance decomposition, cluster sampling is affected by the variance between groups and not by the variance within the groups. Thus, its efficiency depends on which of these two components of the variance of the variable of interest has the greater weight. In general, but not necessarily, we assume that the groups are planned to be homogeneous; there should then be a very high variance between the groups, and cluster sampling should be less efficient than both the stratified design and SRS.
Its efficiency with respect to the SRS design is measured using (Särndal et al. 1992, p. 132)

$$\mathrm{deff}\left(\mathrm{Clus}, \hat{t}_{HT}\right) = 1 + \left(\bar{N} - 1\right)\delta, \qquad (6.18)$$
where $\bar{N}$ is the average size of the groups, and $\delta$ is a homogeneity coefficient (see Särndal et al. 1992, p. 130). Unless $\delta$ is negative (meaning there is a large within-group variation), this design is less efficient than SRS.
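As a rough numerical illustration, the following base R sketch evaluates a design effect of the form deff = 1 + (N_bar - 1) * delta for a few values of the homogeneity coefficient. The function name `deff_cluster`, the average group size, and the delta values are all hypothetical choices made for this illustration; they are not taken from the text.

```r
# Design effect of single-stage cluster sampling relative to SRS,
# assuming the form deff = 1 + (N_bar - 1) * delta.
# deff_cluster is a hypothetical helper, not a function of any package.
deff_cluster <- function(N_bar, delta) {
  1 + (N_bar - 1) * delta
}

N_bar <- 10                        # hypothetical average group size
delta <- c(-0.05, 0, 0.1, 0.5)     # hypothetical homogeneity coefficients
deff  <- deff_cluster(N_bar, delta)

# deff < 1 only for negative delta; deff = 1 when delta = 0
print(data.frame(delta = delta, deff = deff))
```

Under these assumptions, even a modest positive delta inflates the variance noticeably once the groups are large, which is the sense in which cluster sampling tends to lose efficiency with respect to SRS.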
If we perform an SRS in both the selection stages, Eq. (6.14) reduces to

$$\hat{t}_{HT,2STSRS} = \sum_{i \in s_1} \sum_{k \in s_i} d_k y_k = \sum_{i \in s_1} \sum_{k \in s_i} \frac{N_1}{n_1} \frac{N_i}{n_i} y_k, \qquad (6.19)$$
where $n_i$ is the sample size within group $i$, and $d_k = (N_1/n_1)(N_i/n_i)$ are the direct sampling weights. Similarly, Eq. (6.15) reduces to
$$\hat{V}_{HT}\left(\hat{t}_{HT,2STSRS}\right) = N_1^2 \frac{1 - f_1}{n_1} S^2_{\hat{t},s_1} + \frac{N_1}{n_1} \sum_{i \in s_1} N_i^2 \frac{1 - f_i}{n_i} S^2_{y,i}, \qquad (6.20)$$
where $S^2_{y,i}$ is the variance of the variable $y$ within group $i$. Note that $S^2_{\hat{t},s_1}$ is different from $S^2_{t,s_1}$ because in the first case the cluster totals are estimated, while in the second case they are known since each group is censused.
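The two-stage SRS estimator and its variance estimate can be sketched in base R as follows. This is only an illustration under stated assumptions: the tiny data set, the helper name `two_stage_srs`, and the column names `group`, `Ni`, and `y` are hypothetical, and the variance is computed in the usual two-stage SRS form, with sampling fractions taken as f1 = n1/N1 and fi = ni/Ni.

```r
# Hypothetical two-stage sample: n1 = 2 groups drawn from N1 = 100 groups
N1 <- 100
n1 <- 2
samp <- data.frame(
  group = c(1, 1, 1, 2, 2),        # label of the sampled group
  Ni    = c(10, 10, 10, 20, 20),   # population size of that group
  y     = c(2, 4, 6, 1, 5)         # observed values of the variable
)

# Hypothetical helper: HT total and variance estimate under SRS at both stages
two_stage_srs <- function(samp, N1, n1) {
  ni <- ave(samp$y, samp$group, FUN = length)  # n_i repeated for each unit
  d  <- (N1 / n1) * (samp$Ni / ni)             # direct weights d_k
  t_hat <- sum(d * samp$y)                     # HT estimate of the total

  by_g <- split(samp, samp$group)
  Ni_g <- sapply(by_g, function(g) g$Ni[1])    # group population sizes
  ni_g <- sapply(by_g, nrow)                   # group sample sizes
  t_i  <- Ni_g * sapply(by_g, function(g) mean(g$y))  # estimated group totals
  s2_i <- sapply(by_g, function(g) var(g$y))   # within-group sample variances

  f1 <- n1 / N1                                # first-stage sampling fraction
  fi <- ni_g / Ni_g                            # second-stage sampling fractions
  v_hat <- N1^2 * (1 - f1) / n1 * var(t_i) +
    (N1 / n1) * sum(Ni_g^2 * (1 - fi) / ni_g * s2_i)

  list(weights = d, total = t_hat, variance = v_hat)
}

res <- two_stage_srs(samp, N1, n1)
print(res$total)
print(res$variance)
```

The first term of `v_hat` uses the variance of the estimated group totals, which is why it differs from the corresponding quantity in single-stage cluster sampling, where each selected group is fully enumerated.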
The mstage function in the sampling package selects multistage random samples. In the following example, we have used it to select (without replacement) $n_1 = 10$ of the $N_1 = 100$ groups obtained by overlaying a 10 × 10 grid over the study region. As with the stratified design, the outcome of this function should be managed using the getdata utility function, which extracts the sample data from the population frame. The HT estimates are obtained using the standard sequence; the id = ~strataid2+id option is used in the svydesign function to identify the codes defining the sampling stages, and the fpc = ~prob1+prob2 option now requires two probability vectors, one for each random selection stage. Note that before executing this code, one must run all the previous code of this chapter, because framepop needs all of the previously generated variables. The selected sample is mapped in Fig. 6.6. Note that the deff shows that the two-stage plan can be inferior to SRS.
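To make the selection scheme concrete, here is a minimal base R sketch of a two-stage SRS over a 10 × 10 grid. It is a stand-in written for illustration and does not call mstage or getdata; the simulated frame, the second-stage size of at most 3 units per cell, and the helper `pick` are assumptions. The column names strataid2, id, prob1, and prob2 are borrowed from the options quoted above, but how the original frame builds them is not shown in the text.

```r
set.seed(1)   # arbitrary seed, for reproducibility only

# Hypothetical population frame: 1000 points in the unit square
frame <- data.frame(id = 1:1000, x = runif(1000), y = runif(1000))

# Overlay a 10 x 10 grid: each point receives the code of its cell (1..100)
gx <- pmin(floor(frame$x * 10), 9)
gy <- pmin(floor(frame$y * 10), 9)
frame$strataid2 <- gy * 10 + gx + 1

# Stage 1: SRS without replacement of n1 = 10 of the non-empty cells
cells <- sort(unique(frame$strataid2))
N1 <- length(cells)
n1 <- 10
s1 <- sample(cells, n1)

# Stage 2: SRS without replacement of up to 3 points in each selected cell
pick <- function(g) {
  units <- frame[frame$strataid2 == g, ]
  ni <- min(3, nrow(units))
  out <- units[sample.int(nrow(units), ni), ]
  out$prob1 <- n1 / N1             # first-stage inclusion probability
  out$prob2 <- ni / nrow(units)    # second-stage inclusion probability
  out
}
smp <- do.call(rbind, lapply(s1, pick))
smp$d <- 1 / (smp$prob1 * smp$prob2)   # direct weights (N1/n1)(Ni/ni)

head(smp)
```

A data frame of this shape, with one stage code and one probability vector per stage, is the kind of input that the id and fpc options described above refer to.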