7.8 Empirical Exercises
To examine the performance of the previously discussed designs, we used two
types of populations²: artificial populations generated with known and specific
attributes, and real data from a sample survey, treated as a population of the kind
that may occur in a two-phase sampling design. We are mainly interested in the
design-based properties of each design. Therefore, even the artificial populations
are held fixed across the simulations. In other words, the only source of
randomization is the replication of the selection algorithm.
7.8.1 Simulated Populations
To better understand how the distribution of the coordinates C and the spatial
features of the target variable y affect the efficiency of different designs,
we generated three different point processes of fixed size N = 1,000 with different
levels of clustering. For each of them, we considered nine possible versions of y,
given by the outcomes of a Gaussian stochastic process with three different
spatial trends and three intensities of a spatial dependence parameter.
To verify whether the efficiency varies with the sampling rate, we selected 10,000
samples of size n = 10, 50, and 100 from each of the 27 populations.
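The factorial layout described above can be tallied with a short sketch. The scale and dependence values are those given later in this section; the trend labels and all variable names are illustrative assumptions, not part of the original study.

```python
from itertools import product

N = 1_000                    # fixed population size
REPLICATES = 10_000          # samples drawn per population and sample size
SAMPLE_SIZES = (10, 50, 100)

# Factor levels (labels are hypothetical; values are quoted from this section)
CLUSTER_SCALES = (0.005, 0.01, 0.03)       # cluster kernel scale
TRENDS = ("none", "linear", "quadratic")   # spatial trend of y
DEPENDENCE = (0.001, 0.01, 0.1)            # spatial dependence parameter

# 3 point patterns x 3 trends x 3 dependence levels
populations = list(product(CLUSTER_SCALES, TRENDS, DEPENDENCE))

# Every population is sampled at each of the three sample sizes
scenarios = list(product(populations, SAMPLE_SIZES))

# Sampling rate n / N for each sample size
sampling_rates = {n: n / N for n in SAMPLE_SIZES}

# Number of samples one design has to draw over the whole experiment
total_samples_per_design = len(scenarios) * REPLICATES
```

The three sample sizes correspond to sampling rates of 1%, 5%, and 10% of the population.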
The bi-dimensional coordinates $x_1$ and $x_2$ were generated in the range [0,1],
using a simulated realization of a particular random point pattern: the
Neyman-Scott process with Cauchy cluster kernel (Waagepetersen 2007). The intensity
of the Poisson process of cluster centers was set to 10, and the mean
per-cluster number of units was 100. Finally, we used three different scale
parameters for the cluster kernel (0.005, 0.01, and 0.03), which respectively represent
a highly clustered, clustered, and sparse population of spatial units (see Fig. 7.8).
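As a rough illustration of the point pattern described above, the following NumPy sketch draws Poisson-distributed cluster centers and scatters offspring around them with a bivariate Cauchy kernel. It is a simplified stand-in, not the authors' procedure. The function name, the way the fixed size N is enforced (offspring are generated until exactly N fall inside the unit square), the neglect of edge effects, and the reading of the scale parameter as a multiplier of a standard bivariate Cauchy offset are all assumptions.

```python
import numpy as np

def simulate_cauchy_cluster_pattern(n_points=1000, kappa=10.0, scale=0.01, seed=None):
    """Sketch of a Neyman-Scott pattern with Cauchy kernel on the unit square.

    Hypothetical helper: conditions on exactly n_points units, which is an
    assumption about how the fixed population size was obtained.
    """
    rng = np.random.default_rng(seed)

    # Cluster centers: Poisson number (mean kappa), uniform on [0, 1]^2.
    # At least one center is forced so that offspring can be generated.
    n_parents = max(1, rng.poisson(kappa))
    parents = rng.uniform(0.0, 1.0, size=(n_parents, 2))

    points = np.empty((0, 2))
    while len(points) < n_points:
        m = 2 * (n_points - len(points))
        # Each candidate unit is attached to a randomly chosen center
        centers = parents[rng.integers(n_parents, size=m)]
        # Bivariate Cauchy offset: standard normal vector divided by the
        # absolute value of an independent standard normal variable
        z = rng.standard_normal((m, 2))
        w = np.abs(rng.standard_normal((m, 1)))
        candidates = centers + scale * z / w
        # Keep only the units that fall inside the unit square
        inside = np.all((candidates >= 0.0) & (candidates <= 1.0), axis=1)
        points = np.vstack([points, candidates[inside]])
    return points[:n_points]

# The three scale values quoted in the text, with a common seed
patterns = {s: simulate_cauchy_cluster_pattern(scale=s, seed=1)
            for s in (0.005, 0.01, 0.03)}
```

With about 10 centers and N = 1,000 units, each cluster holds roughly 100 units on average, in line with the mean per-cluster number given in the text.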
For each of these geo-referenced populations, we simulated several stationary
spatial Gaussian random fields, y (Lantuejoul 2002), with no spatial trend, a linear
trend $x_1 + x_2 + \varepsilon$, and a quadratic trend
$(x_1 - 0.5)^2 + (x_2 - 0.5)^2 + \varepsilon$. These trends
explain approximately 80% of the variance of the generated population variable y.
Conversely, we used an exponential covariance function with dependence parameters
$\rho = (0.001, 0.01, 0.1)$ for the errors $\varepsilon$, which respectively represent low,
medium, and high homogeneity of the data of close units. To avoid the possible
effects caused by different variabilities, each population was standardized to have
the same mean ($\mu_y = 5$) and standard deviation ($\sigma_y = 1$).
We compared the different designs using the MSE of the 10,000 HT estimates
of the population mean relative to the same error obtained under an SRS design. This
scale factor was used to remove the effects of the population size N and of
the sample size n on the sampling errors. It is worth noting that, in every simulation, the
² The empirical results presented here are partly based on Benedetti et al. (2015).
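The generation of the target variable and the efficiency measure described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. The covariance is taken to be exp(-d / rho), the errors are rescaled so that the trend accounts for roughly 80% of the variance, the population standard deviation uses divisor N, and SRS is taken to be simple random sampling without replacement. All function names and the uniform placeholder coordinates are hypothetical.

```python
import numpy as np

def simulate_target(coords, trend="none", rho=0.01, trend_share=0.8, seed=None):
    """Simulate y = trend + eps on fixed points, standardized to mean 5, sd 1."""
    rng = np.random.default_rng(seed)
    x1, x2 = coords[:, 0], coords[:, 1]
    n = len(coords)
    # Assumed exponential covariance exp(-d / rho) between all pairs of units
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = np.exp(-d / rho) + 1e-8 * np.eye(n)   # small jitter for stability
    eps = np.linalg.cholesky(cov) @ rng.standard_normal(n)
    if trend == "none":
        y = eps
    else:
        if trend == "linear":
            m = x1 + x2
        elif trend == "quadratic":
            m = (x1 - 0.5) ** 2 + (x2 - 0.5) ** 2
        else:
            raise ValueError(f"unknown trend: {trend}")
        # Assumption: rescale eps so the trend explains about trend_share of Var(y)
        eps_sd = np.sqrt(m.var() * (1.0 - trend_share) / trend_share)
        y = m + eps * eps_sd / eps.std()
    # Standardize to mean 5 and standard deviation 1 (divisor N)
    return 5.0 + (y - y.mean()) / y.std()

def ht_mean(y_sample, incl_prob, pop_size):
    """Horvitz-Thompson estimate of the population mean."""
    return np.sum(y_sample / incl_prob) / pop_size

def srs_ht_estimates(y, n, replicates, seed=None):
    """HT estimates of the mean over repeated SRS draws without replacement."""
    rng = np.random.default_rng(seed)
    pop_size = len(y)
    incl_prob = n / pop_size   # first-order inclusion probability under SRS
    return np.array([
        ht_mean(y[rng.choice(pop_size, size=n, replace=False)], incl_prob, pop_size)
        for _ in range(replicates)
    ])

def mse(estimates, true_mean):
    """Monte Carlo mean squared error of a set of estimates."""
    return np.mean((np.asarray(estimates) - true_mean) ** 2)

def relative_mse(design_estimates, srs_estimates, true_mean):
    """MSE of a design relative to the MSE obtained under SRS."""
    return mse(design_estimates, true_mean) / mse(srs_estimates, true_mean)

# The population is generated once and kept fixed; only the sample varies.
coords = np.random.default_rng(0).uniform(size=(200, 2))  # placeholder coordinates
y = simulate_target(coords, trend="linear", rho=0.01, seed=1)
srs_estimates = srs_ht_estimates(y, n=20, replicates=1000, seed=2)
print(mse(srs_estimates, y.mean()))
```

Estimates from any other design, computed on the same fixed population, can be passed as the first argument of `relative_mse`; a value below 1 would then indicate a smaller MSE than under SRS.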