and Tibshirani 1993), while in finite population inference it is called replicate
weights (Wolter 2007).
These methods all share the same underlying logic, and differ only in the scheme
used to create replicates from the sample. The estimate of interest is calculated
from the full sample and from each replicate. The differences between the
full-sample estimate and the replicate estimates are then used to determine the
variance. Different methods of creating the subsamples yield different types of
replicate weights.
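The shared logic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `replicate_variance`, its `scale` parameter, and the toy data are hypothetical, and the delete-one scheme used to build the replicates is just one possible choice, not necessarily the one intended by the text.

```python
from statistics import mean, variance


def replicate_variance(full_estimate, replicate_estimates, scale=None):
    """Scaled sum of squared deviations of the replicate estimates
    around the full-sample estimate.

    `scale` is a method-specific multiplier (hypothetical parameter);
    it defaults to 1/m, where m is the number of replicates.
    """
    m = len(replicate_estimates)
    if m == 0:
        raise ValueError("at least one replicate estimate is required")
    if scale is None:
        scale = 1.0 / m
    return scale * sum((r - full_estimate) ** 2 for r in replicate_estimates)


# Toy data (made up for illustration).
sample = [4.0, 6.0, 5.0, 7.0]
n = len(sample)
full = mean(sample)

# Assumed replication scheme: drop one unit at a time (jackknife-style).
reps = [mean(sample[:i] + sample[i + 1:]) for i in range(n)]

# For delete-one replicates the usual multiplier is (n - 1) / n.
v = replicate_variance(full, reps, scale=(n - 1) / n)

# For the sample mean this should agree with the textbook formula s^2 / n.
print(v, variance(sample) / n)
```

Other replication schemes would change only how `reps` is built and which multiplier is passed as `scale`.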
The kind of replicate weights that we choose may be influenced by the type of
sampling design that was used to collect the data. In particular, it is very important
to know whether stratification was used, and how many units were in each stratum.
According to resampling theory, the m subsamples should be selected with
replacement to ensure that the estimates are independent. This requirement
strongly limits the selection criteria that we can use for a finite population.
Non-independent samples within a resampling framework (not the resampling
scheme) introduce a bias in the variance estimates, even for linear
estimators. However, empirical studies have concluded that the bias is negligible.
When choosing the number of replicates m, we must consider that the stability of
the variance estimator improves as m increases. In the nonlinear case, the bias of
the variance estimator increases with m, but it decreases as the size of the
replicates increases. Thus, m should not be very large, but it should be high
enough to meet the stability requirements.
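The stability part of this trade-off can be illustrated with a small simulation. The sketch below assumes with-replacement (bootstrap-style) replicates of a sample mean on synthetic data; the helper names `bootstrap_variance` and `spread`, the seed, and the chosen values of m are all hypothetical. It does not address the bias behaviour in the nonlinear case.

```python
import random
from statistics import mean, pstdev


def bootstrap_variance(sample, m, rng):
    """Variance of the sample mean estimated from m with-replacement replicates."""
    full = mean(sample)
    n = len(sample)
    reps = [mean(rng.choices(sample, k=n)) for _ in range(m)]
    return sum((r - full) ** 2 for r in reps) / m


def spread(sample, m, rng, repeats=30):
    """Standard deviation of the variance estimate across repeated runs,
    used here as a rough measure of (in)stability for a given m."""
    estimates = [bootstrap_variance(sample, m, rng) for _ in range(repeats)]
    return pstdev(estimates)


rng = random.Random(12345)  # fixed seed so the sketch is reproducible
sample = [rng.gauss(50.0, 10.0) for _ in range(40)]  # synthetic data

spread_small_m = spread(sample, 10, rng)
spread_large_m = spread(sample, 200, rng)

# The variance estimate fluctuates less from run to run when m is larger.
print(spread_small_m, spread_large_m)
```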
One way of selecting a replicate is to use the technique of balanced
half-samples. Assume that there is a very fine partition of the sampled units
into H strata, such that $n_h = 2$ for each h. A half-sample is a set of units
consisting of exactly one unit per stratum, yielding $2^H$ possible splits. In
other words, only one of the two elements from each stratum h is selected. This
can lead to a very large number of subsets, so we cannot list all of them in
practice. Instead, we must select a subset of the possible half-samples,
calculate the estimate for each selected half-sample, and then use these
estimates to estimate the variance of the parameter of interest. For each
half-sample $a = 1, 2, \ldots, 2^H$, consider the two generic elements h1 and h2
of stratum h. We can then define the two indicator variables
\[
\delta_{ah} =
\begin{cases}
1 & \text{if the $a$-th half-sample contains the unit } h1 \\
0 & \text{if the $a$-th half-sample contains the unit } h2
\end{cases}
\tag{10.29}
\]
and
\[
\varepsilon_{ah} =
\begin{cases}
1 & \text{if the $a$-th half-sample contains the unit } h1 \\
-1 & \text{if the $a$-th half-sample contains the unit } h2
\end{cases}
\tag{10.30}
\]
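As a rough illustration of these indicators, the sketch below enumerates every half-sample for a small number of strata. It assumes the usual convention that the second indicator equals -1 when the half-sample contains unit h2, so that it can be obtained from the first indicator as 2 * delta - 1. The function names and the toy pairs of values are hypothetical.

```python
from itertools import product


def half_sample_indicators(num_strata):
    """Yield (delta, epsilon) for each of the 2**num_strata half-samples.

    delta[h]   = 1 if unit h1 of stratum h is selected, 0 if unit h2 is.
    epsilon[h] = 2 * delta[h] - 1, i.e. +1 for h1 and -1 for h2 (assumed).
    """
    for delta in product((1, 0), repeat=num_strata):
        epsilon = tuple(2 * d - 1 for d in delta)
        yield delta, epsilon


def select_units(pairs, delta):
    """Pick one value per stratum according to the indicator delta."""
    return [h1 if d == 1 else h2 for (h1, h2), d in zip(pairs, delta)]


# Toy data: one (h1, h2) pair of observed values per stratum.
pairs = [(10.0, 12.0), (7.0, 9.0), (20.0, 18.0)]

indicators = list(half_sample_indicators(len(pairs)))
first_delta, first_epsilon = indicators[0]
first_half_sample = select_units(pairs, first_delta)

print(len(indicators))  # number of possible half-samples, 2**H
print(first_delta, first_epsilon, first_half_sample)
```

Full enumeration is only feasible for a small H, since the number of half-samples doubles with every additional stratum.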
A set of m half-samples is said to be balanced if