Agriculture Reference
In-Depth Information
As previously demonstrated, there are two groups of statistical methodologies
for sampling. The first has been introduced in this chapter, and it is usually referred
to as the model-based approach.
The second is the design-based approach, which has been widely discussed and
used in this book. Under this methodological framework and with complete
response, the inference is based on the conditional density
$f(i_U \mid y_U, X_U)$, where the
population values of both the survey and auxiliary variables are treated as fixed, and
the only sources of randomness are the random variables that characterize the
selection processes. Obviously, it is not possible to postulate any model for the
population distribution of $y_U$ in design-based analysis. As a consequence, we
cannot say anything about the parameter ($\theta$) that specifies the population.
Here, we are interested in finite population parameters that are well-defined
functions of the values $y_U$. Because $y_U$ is theoretically observable if a census
survey is performed, the finite population parameters can be calculated on the
entire population and are referred to as census parameters. In practice, the purpose of
design-based analysis is to test the values of the finite population parameters that
identify the population distribution, using repeated sampling distributions of
estimates of these parameters. It is evident that design-based analysis is only possible
when a probability selection method has been used.
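The repeated-sampling idea can be illustrated with a minimal Python sketch. It is not taken from the text: the population values, the simple random sampling design, and the helper names `ht_total` and `repeated_sampling` are assumptions chosen for illustration. The population is generated once and then held fixed, so the only randomness across replications is the sample selection.

```python
import random


def ht_total(sample_values, inclusion_probs):
    """Expansion (Horvitz-Thompson type) estimate of a population total."""
    return sum(y / p for y, p in zip(sample_values, inclusion_probs))


def repeated_sampling(y_U, n, replications, seed=0):
    """Draw repeated simple random samples without replacement from the
    fixed population y_U and return the estimated total for each sample."""
    rng = random.Random(seed)
    N = len(y_U)
    pi = n / N  # first-order inclusion probability under this design
    estimates = []
    for _ in range(replications):
        s = rng.sample(range(N), n)  # selected unit labels
        estimates.append(ht_total([y_U[k] for k in s], [pi] * n))
    return estimates


# Hypothetical population: generated once, then treated as fixed values.
pop_rng = random.Random(42)
y_U = [pop_rng.gauss(50.0, 10.0) for _ in range(200)]

census_total = sum(y_U)  # the census parameter
estimates = repeated_sampling(y_U, n=40, replications=5000)
mean_estimate = sum(estimates) / len(estimates)

print(f"census total:            {census_total:.1f}")
print(f"mean of sample estimates: {mean_estimate:.1f}")
```

Under these assumptions, the average of the estimates over many replications should lie close to the census total, which is what design-unbiasedness means in the repeated-sampling sense.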
Although a design-based approach does not assume any distributional hypothesis
for $y_U$, the finite population parameters of interest and the estimator can be
justified by assumptions about the distribution. This is the model-assisted
framework (Särndal et al. 1992), and is the basis for the pseudolikelihood approach to
survey data analysis.
Let $f(y_U; \theta)$ denote the density of the population $y_U$. We consider that this density
is known. Given the values $y_U$, the maximum likelihood estimate of $\theta$
(defined as $\theta_U$) is obtained by solving
$\frac{\partial}{\partial\theta} \log f(y_U; \theta) = 0$, where
$sc(\theta) = \frac{\partial}{\partial\theta} \log f(y_U; \theta)$ is the
score function. Obviously, $sc(\theta_U) = 0$. For any value of $\theta$, $sc(\theta)$ defines a finite
population parameter. As a consequence, $\theta_U$ is also a finite population parameter.
The pseudolikelihood approach constructs a design-consistent estimate of the
score function $sc(\theta)$, sets this estimate equal to zero, and solves the resulting
equation to find the pseudolikelihood estimate of $\theta$. For a fixed $\theta$, $sc_w(\theta)$ represents
a design-consistent estimate of $sc(\theta)$ based on the observed data. A maximum
pseudolikelihood estimator $\hat{\theta}_{PL}$ of $\theta$ is such that
$sc_w(\hat{\theta}_{PL}) = 0$. Note that the
procedure does not ensure that the pseudolikelihood estimator is unique.
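As a rough illustration of solving an estimated score equation, the following Python sketch makes several assumptions that are not in the text: the units are independent with an exponential density $f(y; \theta) = \theta e^{-\theta y}$, the score is estimated by weighting each sampled unit's contribution by a design weight $w_k$ (for example, the inverse of its inclusion probability), and the root is found by bisection. The function names and the data are hypothetical.

```python
def weighted_score(theta, y_s, w_s):
    """Weighted estimate of the score for the assumed exponential model.

    For f(y; theta) = theta * exp(-theta * y), the unit-level score is
    d/dtheta log f(y_k; theta) = 1/theta - y_k.
    """
    return sum(w * (1.0 / theta - y) for y, w in zip(y_s, w_s))


def solve_score(y_s, w_s, lo=1e-8, hi=1e3, tol=1e-12, max_iter=200):
    """Find theta with weighted_score(theta) = 0 by bisection.

    With positive weights, this particular score is strictly decreasing
    in theta, so a sign change on [lo, hi] brackets a single root.
    """
    if weighted_score(lo, y_s, w_s) <= 0 or weighted_score(hi, y_s, w_s) >= 0:
        raise ValueError("score does not change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if weighted_score(mid, y_s, w_s) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical sample values and design weights (illustrative only).
y_s = [2.0, 0.5, 1.5, 3.0]
w_s = [10.0, 20.0, 10.0, 5.0]

theta_pl = solve_score(y_s, w_s)

# For this model the weighted score equation also has a closed-form root,
# sum(w) / sum(w * y), which serves as a check on the numerical solution.
theta_closed = sum(w_s) / sum(w * y for y, w in zip(y_s, w_s))
print(theta_pl, theta_closed)
```

The exponential model was chosen only because its score is monotone, so the root is unique; for other models the estimated score equation may have several roots, which is the non-uniqueness noted above.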
To clarify this technique, we present a simple example where the population
units are considered to be independently distributed, and the design-based estimate
is obtained using an expanded estimator (see Sect. 1.2).
Assume that we have a complete response, and that
$f(y_U; \theta) = \prod_{k \in U} f(y_k; \theta)$, where $f(y_k; \theta)$ is the density of the
$k$-th population unit. In this case,
$sc(\theta) = \sum_{U} \frac{\partial}{\partial\theta} \log f(y_k; \theta) = \sum_{U} u_k$,
where $u_k = \frac{\partial}{\partial\theta} \log f(y_k; \theta)$. Given the
sample values, following the same logic as the HT estimator, the expansion