individual specimens. The next step is to construct a second data set using Monte Carlo
simulations. That is done by estimating the mean and range of each variable; from the
data, N − 1 simulated specimens are generated with values randomly drawn from the
observed range. Monte Carlo simulations are similar to bootstraps in that they simulate
data based on a given null model and an observed set of data, but they differ in that
bootstrapping is carried out using a non-parametric resampling procedure whereas Monte
Carlo simulations are based on a distributional model. The distribution of the original
data set is parameterized, and those parameters are used to generate a simulated data set
having the distribution of the observations. Given the simulated data, a second
nearest-neighbor distance, R_i, is computed between each observed specimen and the one closest to
it in the Monte Carlo set (note that R_i is not a nearest-neighbor distance between Monte
Carlo specimens, but rather the distance between an observed specimen and the nearest
Monte Carlo simulated specimen).
Foote provides a measure that allows us to compare the fit of the simulated distances to
the observed ones, the proportional distance P_i for the ith specimen. This is a ratio whose
numerator is the difference between the two distances (D_i, the observed nearest-neighbor
distance, and R_i, the Monte Carlo nearest-neighbor distance) and whose denominator is
the Monte Carlo nearest-neighbor distance:
\[ P_i = \frac{D_i - R_i}{R_i} \tag{10.6} \]
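As an illustration of Equation 10.6, the following is a minimal sketch rather than Foote's own implementation. It assumes Euclidean distances between specimens, and the function names and the small data set are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math


def nearest_distance(point, others):
    """Euclidean distance from `point` to the closest point in `others`."""
    return min(math.dist(point, other) for other in others)


def proportional_distances(observed, simulated):
    """Return P_i = (D_i - R_i) / R_i for every observed specimen."""
    result = []
    for i, specimen in enumerate(observed):
        # D_i: distance to the nearest of the other observed specimens.
        others = observed[:i] + observed[i + 1:]
        d_i = nearest_distance(specimen, others)
        # R_i: distance to the nearest specimen in the Monte Carlo set.
        r_i = nearest_distance(specimen, simulated)
        result.append((d_i - r_i) / r_i)
    return result


# Made-up data: three observed and two simulated specimens, two variables each.
observed = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
simulated = [(0.0, 10.0), (6.0, 0.0)]
p = proportional_distances(observed, simulated)
```

For the second observed specimen, both the nearest observed neighbor and the nearest simulated specimen lie 5 units away, so its proportional distance is 0. The first specimen has a nearest observed neighbor at 5 units and a nearest simulated specimen at 6 units, giving −1/6.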
If the random model fits the data, we would expect that, on average, D_i would equal R_i,
and hence the mean P_i over all specimens (P_mean) is zero. When P_mean is less than zero the
observed specimens are more clustered than expected by chance; conversely, if P_mean is
greater than zero they are further apart than expected by chance. To determine whether
zero lies within the confidence interval, we estimate the range of P_mean by running the
Monte Carlo simulation many times.
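The repeated-simulation step can be sketched as follows. This is a hedged illustration under several assumptions that are not taken from the source: a uniform model bounded by the observed minimum and maximum of each variable, Euclidean distances, a simple percentile interval, and a simulated set of N − 1 specimens by default (adjustable through the hypothetical `n_simulated` parameter). All function names are illustrative.

```python
import math
import random


def nearest_distance(point, others):
    """Euclidean distance from `point` to the closest point in `others`."""
    return min(math.dist(point, other) for other in others)


def mean_proportional_distance(observed, simulated):
    """P_mean: the average of (D_i - R_i) / R_i over all observed specimens."""
    total = 0.0
    for i, specimen in enumerate(observed):
        d_i = nearest_distance(specimen, observed[:i] + observed[i + 1:])
        r_i = nearest_distance(specimen, simulated)
        total += (d_i - r_i) / r_i
    return total / len(observed)


def simulate_uniform(bounds, n_points, rng):
    """Draw specimens with each variable uniform within its (low, high) bounds."""
    return [tuple(rng.uniform(low, high) for low, high in bounds)
            for _ in range(n_points)]


def p_mean_interval(observed, n_runs=1000, alpha=0.05, seed=0, n_simulated=None):
    """Percentile interval for P_mean over repeated Monte Carlo sets."""
    rng = random.Random(seed)
    if n_simulated is None:
        n_simulated = len(observed) - 1  # assumption: N - 1 simulated specimens
    n_vars = len(observed[0])
    # Assumption: bounds are the observed minimum and maximum of each variable.
    bounds = [(min(s[j] for s in observed), max(s[j] for s in observed))
              for j in range(n_vars)]
    values = sorted(
        mean_proportional_distance(
            observed, simulate_uniform(bounds, n_simulated, rng))
        for _ in range(n_runs)
    )
    lower = values[int((alpha / 2) * n_runs)]
    upper = values[int((1 - alpha / 2) * n_runs) - 1]
    return lower, upper


# Made-up observed specimens with two variables each.
observed = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (0.5, 2.5)]
low, high = p_mean_interval(observed, n_runs=200, seed=1)
```

If the interval `(low, high)` excludes zero, the observed specimens would be judged more clustered (interval below zero) or more dispersed (interval above zero) than the chosen model predicts.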
To generate a Monte Carlo set under a multivariate normal (Gaussian) model, we
must estimate the mean and standard deviation of each variable; to generate a Monte
Carlo set under a uniform distribution model, we must estimate the upper and lower
bounds of the range for each variable. It can be difficult to estimate the range accurately
when sample sizes are small because, at small sample sizes, the observed minimum and
maximum will underestimate the “true” range. Thus, rather than using the observed minimum
and maximum values to estimate the range, Foote uses estimators developed by
Strauss and Sadler (1989) for the “true” minimum (Y) and the “true” maximum (Z) of a
distribution:
\[ Y = \frac{NA - B}{N - 1} \tag{10.7} \]
\[ Z = \frac{NB - A}{N - 1} \tag{10.8} \]
where A is the lowest observed value and B is the highest observed value in N specimens.
Rather than use the observed minimum and maximum values, Foote determines the mean
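The range estimators in Equations 10.7 and 10.8 can be sketched in a few lines. This is a minimal illustration for a single variable; the function name and the sample values are hypothetical.

```python
def estimate_true_range(values):
    """Estimate the "true" minimum Y and maximum Z of one variable.

    Y = (N*A - B) / (N - 1) and Z = (N*B - A) / (N - 1), where A and B are
    the lowest and highest observed values among N specimens.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    a = min(values)
    b = max(values)
    y = (n * a - b) / (n - 1)
    z = (n * b - a) / (n - 1)
    return y, z


# Made-up sample: N = 5, A = 2, B = 10.
y, z = estimate_true_range([2.0, 3.0, 5.0, 6.0, 10.0])
print(y, z)  # 0.0 12.0
```

Each bound is pushed outward by (B − A)/(N − 1), here 2 units, so the correction shrinks as the sample size grows.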