expenditure on capital equipment, the total salary costs, or the total production of a
field. Alternatively, practitioners may use the mean per unit as a descriptive
statistic, which is often a total divided by an estimate of the number of units that
contribute to the total.
A superpopulation model (see Sect. 1.3) can be used to formalize the relationship between a target variable y and auxiliary data X. For example, in a survey of
farms, the yield of a crop in a particular period may be related to the geographical
coordinates of the field, to the elevation (obtained through a digital elevation
model), and to the terrain. The main assumption is that the quantities of interest
are modeled as realizations of random variables with a particular joint probability
distribution. For example, in this case, the model can be defined as
$$Y_k = \beta_0 + \beta_1 x_{1k} + \beta_2 x_{2k} + \beta_3 x_{3k} + \varepsilon_k, \qquad k = 1, \ldots, N, \tag{12.1}$$
where $Y_k$ (see footnote 1) is the yield of a crop, $x_{1k}$, $x_{2k}$, $x_{3k}$ are the covariates, and the $\varepsilon_k$ are uncorrelated random errors with mean 0 and variance $\sigma^2 x_k$. This is a simple
specification, but more complicated models can be used. In fact, we can add
different or additional covariates to the model, or use a non-linear relationship
between the variables.
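To make the idea of a "realization" concrete, the following is a minimal sketch of how one population vector could be drawn from a linear model of the form (12.1). The function name `simulate_yield`, the coefficient values, and the three covariates (two coordinates and an elevation) are hypothetical and serve only as an illustration. The model itself specifies only the mean and variance of the errors; the normal distribution used here is an additional assumption, as is the vector `v` of variance multipliers, which defaults to a constant variance.

```python
import numpy as np


def simulate_yield(x, beta, sigma2, v=None, seed=None):
    """Draw one realization y of Y under a linear model like (12.1).

    x      : N x 3 array of covariates (hypothetical: easting, northing, elevation)
    beta   : sequence (beta_0, beta_1, beta_2, beta_3)
    sigma2 : error variance scale
    v      : optional length-N variance multipliers (assumed; defaults to ones)
    """
    x = np.asarray(x, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n_units = x.shape[0]
    v = np.ones(n_units) if v is None else np.asarray(v, dtype=float)

    rng = np.random.default_rng(seed)
    # Uncorrelated errors with mean 0 and variance sigma2 * v_k.
    # Normality is an assumption made only for this illustration.
    eps = rng.normal(0.0, np.sqrt(sigma2 * v))

    # Linear predictor beta_0 + beta_1 x_1k + beta_2 x_2k + beta_3 x_3k, plus error.
    return beta[0] + x @ beta[1:] + eps


if __name__ == "__main__":
    # Three hypothetical fields: (easting, northing, elevation).
    x = np.array([[0.0, 0.0, 100.0],
                  [1.0, 2.0, 150.0],
                  [2.0, 1.0, 120.0]])
    print(simulate_yield(x, beta=[3.0, 0.5, -0.2, 0.01], sigma2=0.25, seed=1))
```

Calling the function repeatedly with different seeds gives different population vectors y, each of which is one realization of the same random vector Y.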
Now, consider the population vector $y = (y_1 \; y_2 \; \ldots \; y_N)^t$ that is treated as the realization of a random vector $Y = (Y_1 \; Y_2 \; \ldots \; Y_N)^t$, and the general linear model $\xi$
$$E_\xi(Y) = X\beta, \qquad \mathrm{Var}_\xi(Y) = V, \tag{12.2}$$
where X is an $N \times q$ matrix of covariates, $\beta$ is a $q \times 1$ vector of unknown parameters, and V is a positive definite covariance matrix.
Under Model (12.2), we can define the population total estimate, and derive the best linear unbiased predictor (BLUP) estimator (Valliant 2009).
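As an illustration, the sketch below computes a BLUP of a linear combination of the population values using the standard form from model-based prediction theory: the observed values enter directly, and each unobserved value is predicted by its model mean plus a correction based on the sample residuals. It assumes that the population is ordered so that the sampled units come first, that X and V are known for the whole population, and that `gamma` is the vector of constants defining the target (all ones for a total). The function name `blup_linear_combination` and the internal names (`X_r`, `V_rs`, and so on, with `r` standing for the non-sampled units) are hypothetical and not the notation of the text.

```python
import numpy as np


def blup_linear_combination(y_s, X, V, gamma):
    """BLUP of gamma' y under E(Y) = X beta, Var(Y) = V.

    Assumes the first n = len(y_s) population units are the sampled ones.
    """
    y_s = np.asarray(y_s, dtype=float)
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    n = y_s.shape[0]

    # Partition into sampled (s) and non-sampled (r) parts.
    X_s, X_r = X[:n], X[n:]
    V_ss, V_rs = V[:n, :n], V[n:, :n]
    gamma_s, gamma_r = gamma[:n], gamma[n:]

    # Generalized least squares estimate of beta from the sample.
    Vinv_X = np.linalg.solve(V_ss, X_s)
    Vinv_y = np.linalg.solve(V_ss, y_s)
    beta_hat = np.linalg.solve(X_s.T @ Vinv_X, X_s.T @ Vinv_y)

    # Predict the unobserved values: model mean plus a residual correction
    # that uses the covariance between non-sampled and sampled units.
    resid = y_s - X_s @ beta_hat
    y_r_pred = X_r @ beta_hat + V_rs @ np.linalg.solve(V_ss, resid)

    # Observed part plus predicted part of the linear combination.
    return float(gamma_s @ y_s + gamma_r @ y_r_pred)


if __name__ == "__main__":
    # Five units, three sampled, intercept-only model, uncorrelated errors.
    X = np.ones((5, 1))
    V = np.eye(5)
    y_s = [2.0, 4.0, 6.0]
    print(blup_linear_combination(y_s, X, V, gamma=np.ones(5)))       # total
    print(blup_linear_combination(y_s, X, V, gamma=np.full(5, 0.2)))  # mean
```

With an intercept-only model and uncorrelated, equal-variance errors, the predictor reduces to expanding the sample mean to the whole population, which is a useful sanity check on the sketch.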
Generally speaking, our objective is to estimate a linear combination of y, namely $\gamma^t y$, where $\gamma = (\gamma_1 \; \gamma_2 \; \ldots \; \gamma_N)^t$ is a vector of constants of size N. If we want to estimate the population total, then $\gamma_k = 1$. Conversely, if we want to estimate the population mean, then $\gamma_k = 1/N$.
We select a sample s of size n from the population of N units, and observe the y values of the sample units. The non-sample units are denoted as $\bar{s}$. Without loss of generality, for any sample s, we can arrange the population vector y so that the first n units are in the sample, and the last $N - n$ are not in the sample. In this way, we can redefine the vector $y = (y_s^t, y_{\bar{s}}^t)^t$, where $y_s$ is the vector of the observed values of the sampled n units, and $y_{\bar{s}}$ is the vector of the unobserved values of the
1 To avoid confusion, note that in this section the uppercase Y indicates a random vector, while the lowercase y describes the realization of Y.