Relationships are usually described using a statistical model, and the outputs
consist of estimates and inferences on the parameters. Many of these methods use a
maximum likelihood approach.
Unfortunately, practitioners often apply infinite-population methods and neglect the
particular characteristics of sample data. The analysis must account for the fact
that survey data come from units selected using complex sample designs. Weights must be
used when analyzing survey data, and the variances of survey estimates must be
computed in a manner that reflects the complex sample design.
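To illustrate why weights matter, the following minimal sketch compares an unweighted and a design-weighted mean. The two-stratum population, its values, and the sample sizes are hypothetical and chosen only for illustration; the weight is taken as the inverse of the inclusion probability, N_h / n_h, within each stratum.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical finite population with two strata of very different sizes.
stratum_a = [1 + (i % 3) for i in range(900)]  # many small units: values 1, 2, 3
stratum_b = [9 + (i % 3) for i in range(100)]  # few large units: values 9, 10, 11
population_mean = (sum(stratum_a) + sum(stratum_b)) / 1000

# Disproportionate stratified sample: 10 units drawn from each stratum.
sample_a = random.sample(stratum_a, 10)
sample_b = random.sample(stratum_b, 10)
values = sample_a + sample_b

# Design weight = inverse inclusion probability = N_h / n_h in stratum h.
weights = [900 / 10] * 10 + [100 / 10] * 10

# Ignoring the design treats every sampled unit as equally representative.
unweighted_mean = sum(values) / len(values)

# The weighted mean rescales each unit by the number of population units it represents.
weighted_mean = sum(w * y for w, y in zip(weights, values)) / sum(weights)

print(f"population mean: {population_mean:.3f}")
print(f"unweighted mean: {unweighted_mean:.3f}")
print(f"weighted mean:   {weighted_mean:.3f}")
```

Because the small stratum is heavily oversampled here, the unweighted mean is pulled toward the large values, whereas the weighted mean stays close to the population mean.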
In this section, we discuss inferential problems in sample surveys, considering
estimates of the parameters of the process that is assumed to have generated the
values of the surveyed finite population. Likelihood theory provides the theoretical
framework that specifies criteria for selecting and evaluating particular inferences
using these data.
Maximum likelihood is a widely used method for point and interval estimation.
Here, our purpose is to develop a general theory of maximum likelihood
estimation for sample survey data analysis. Our discussion follows Chambers and
Skinner (2003) and Chambers et al. (2012), to which the reader can refer for
greater detail. Furthermore, note that this theory assumes the standard regularity
conditions for likelihood analysis (see, for example, Serfling 1980, Sect. 4.2).
Let y denote a survey variable of interest, which represents a realization of Y.
The values of this variable can be theoretically observed for each of the N units of
the surveyed population U, say y_U. We assume that y is generated from a
distribution f(y_U; θ), which is known except for a parameter θ.² Obviously, we can
effectively use the classical approach to maximum likelihood inference if y_U is
completely observed for the entire population U.
In this approach, the parameter θ is defined with respect to a specified
superpopulation model f(y_U; θ), which corroborates a link with the predictive
approach described in Sect. 12.2.
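As a minimal sketch of the classical case in which the whole population is observed, suppose (purely as an illustrative assumption, not a model taken from the text) that the N population values are independent draws from an exponential distribution with rate θ. The log-likelihood is then N log θ - θ Σ y_i, which is maximized at N / Σ y_i.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical superpopulation model: independent exponential values with rate theta.
true_theta = 0.5
N = 1000
y_U = [random.expovariate(true_theta) for _ in range(N)]

def log_likelihood(theta, y):
    """Exponential log-likelihood of the values y at rate theta."""
    return len(y) * math.log(theta) - theta * sum(y)

# Closed-form maximizer of the log-likelihood when all of y_U is observed.
theta_hat_U = N / sum(y_U)

print(f"full-population MLE of theta: {theta_hat_U:.3f}")
```

Evaluating `log_likelihood` at values slightly above and below `theta_hat_U` gives a quick numerical check that the closed form is indeed the maximizer.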
Unfortunately, y_U is not entirely observed in a sample survey. Instead, we
survey a sample s of size n. With complete response, the vector y_s
contains the n observed values of the target variable. Our aim is to use the
data observed in the survey sample (y_s) to estimate θ using a maximum likelihood
approach.
The likelihood is always the density of the observed data. If we assume that we
have a complete response, the likelihood becomes the density of the sampled data.
To apply maximum likelihood, we must know the distribution of y_s, which depends on
the distribution of y_U (and hence on θ) and on how the survey sample is selected.
Chambers et al. (2012) assumed that y_s is generated in two steps. In the first step,
y_U is realized, but not observed. In the second, a subset s of U is selected, and y_s is
observed. A wide variety of sample selection methods is in common use,
and the reader can refer to Chaps. 6 and 7 of this book for more details.
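The two-step mechanism can be sketched in a small simulation. The exponential model, the population and sample sizes, and the use of simple random sampling without replacement are all assumptions made for illustration. Because this selection does not depend on the y values, the sample values can be treated as independent draws from the same model, and the sample likelihood takes the same form as the population likelihood.

```python
import math
import random

random.seed(2)  # fixed seed so the sketch is reproducible

# Step 1: the population values y_U are realized from the assumed model
# (hypothetical choice: independent exponential values with rate theta).
true_theta = 0.5
N, n = 1000, 100
y_U = [random.expovariate(true_theta) for _ in range(N)]

# Step 2: a subset s of U is selected, here by simple random sampling
# without replacement, which does not depend on the y values.
s = random.sample(range(N), n)
y_s = [y_U[i] for i in s]  # only these n values are observed

def log_likelihood(theta, y):
    """Exponential log-likelihood of the observed values y at rate theta."""
    return len(y) * math.log(theta) - theta * sum(y)

# Under this noninformative design the sample MLE is n / sum(y_s).
theta_hat_s = n / sum(y_s)
theta_hat_U = N / sum(y_U)  # unavailable in practice; shown only for comparison

print(f"sample MLE: {theta_hat_s:.3f}, full-population MLE: {theta_hat_U:.3f}")
```

Under an informative design, where selection depends on y, this simplification would no longer hold and the selection mechanism would have to enter the likelihood.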
² For the sake of simplicity, we have supposed that the function f(.) depends on only one parameter θ. These methods can be straightforwardly extended to the multivariate case.