Biology Reference
In-Depth Information
assignment of specimens into groups. To form the Monte Carlo set, we will assume that
the single underlying distribution is normal. We then estimate the mean and standard
deviation of this underlying distribution by merging the data sets into a single group. The
mean of the single distribution is 3.67 and the standard deviation is 2.1. To determine the
significance of the observed difference in the means of the two groups, we generate a
series of paired Monte Carlo sets, one with a sample size $N_X = 31$ and one with a sample size
$N_Y = 18$, and we determine the difference between the two means. We then determine the
proportion of N Monte Carlo sets in which the difference between the means of the paired
Monte Carlo sets exceeds that observed between the means of the original data sets.
For the sets X and Y above, the Monte Carlo sets were generated under the assumption
that both samples were drawn from the same normal distribution, with a mean of 3.67
and a standard deviation of 2.1 (the mean and standard deviation of the combined data
sets). In 480 of 1000 pairs of Monte Carlo sets (48%), the difference between the means of
the paired Monte Carlo sets exceeds the observed difference between the means of the
original data sets. Thus, the null hypothesis of a single underlying normal distribution
cannot be rejected. It should be noted that the combined data set (of all specimens in X and Y)
is probably not normally distributed, so we might want to repeat the Monte Carlo test
using other models of the underlying distribution.
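The procedure described above can be sketched in Python roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `monte_carlo_mean_diff_test`, the use of NumPy, the fixed seed, and the choice to compare absolute differences (a two-sided test) are all assumptions. The observed difference between the group means is not restated in this passage, so it is left as an argument.

```python
import numpy as np

def monte_carlo_mean_diff_test(observed_diff, n_x=31, n_y=18,
                               mean=3.67, sd=2.1, n_sets=1000, seed=0):
    """Proportion of paired Monte Carlo sets whose difference in means
    exceeds the observed difference, under a single normal distribution."""
    rng = np.random.default_rng(seed)
    # Each row is one Monte Carlo set drawn from the assumed common distribution.
    x_sets = rng.normal(mean, sd, size=(n_sets, n_x))
    y_sets = rng.normal(mean, sd, size=(n_sets, n_y))
    # Difference between the means of each pair of sets.
    diffs = x_sets.mean(axis=1) - y_sets.mean(axis=1)
    # Assumption: compare absolute differences (two-sided test).
    return float(np.mean(np.abs(diffs) > abs(observed_diff)))

# Hypothetical observed difference, used only to show the call.
p_value = monte_carlo_mean_diff_test(observed_diff=0.4)
print(f"proportion exceeding observed difference: {p_value:.3f}")
```

A large proportion (such as the 48% reported above) means that differences as large as the observed one arise often under the single-distribution model, so that model cannot be rejected. Swapping `rng.normal` for another generator would be one way to try other models of the underlying distribution.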
Monte Carlo simulations are particularly useful for testing different hypothetical situations
when the underlying distributions are believed to be well known. Monte Carlo methods
can be used in cases when bootstrap methods cannot, such as to estimate the effect of
increasing the sample size on the estimated variance; Monte Carlo simulations are not
limited by the observed sample sizes (as bootstrap methods are).
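As a rough illustration of this point, the sketch below simulates how the spread of the estimated variance changes with sample size, including sizes larger than any observed sample. It assumes the normal model used earlier (mean 3.67, standard deviation 2.1); the function name `variance_estimate_spread`, the sample sizes chosen, and the number of simulated sets are hypothetical.

```python
import numpy as np

def variance_estimate_spread(sample_sizes, mean=3.67, sd=2.1,
                             n_sets=2000, seed=0):
    """For each sample size, simulate n_sets samples from the assumed normal
    distribution and summarize the resulting sample variances."""
    rng = np.random.default_rng(seed)
    summary = {}
    for n in sample_sizes:
        samples = rng.normal(mean, sd, size=(n_sets, n))
        # Unbiased sample variance of each simulated data set.
        variances = samples.var(axis=1, ddof=1)
        # Store (average estimate, spread of the estimates).
        summary[n] = (float(variances.mean()), float(variances.std(ddof=1)))
    return summary

# Sample sizes are illustrative; 200 exceeds anything in the observed data.
spread = variance_estimate_spread([18, 49, 200])
for n, (avg, sd_of_var) in spread.items():
    print(f"N = {n:3d}: mean variance = {avg:.2f}, spread = {sd_of_var:.2f}")
```

The spread of the variance estimates should shrink as N grows, which is the kind of question a bootstrap cannot address directly because it resamples only the specimens actually observed.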
Example: Resampling Tests and Regression Models
To this point, we have focused on t-tests, but computer-based methods are useful for a
wide variety of tests. To develop a more general understanding of these methods, we now
show how bootstrap and permutation methods can be used in regression analysis. As
presented earlier in this chapter, the model for a regression, given N observations of pairs of
measurements $(X_i, Y_i)$, is:
$$Y_i = A + B X_i + \varepsilon_i \qquad (8A.35)$$
The slope, B, is given by:
$$B = \frac{s_{XY}}{s_{XX}} \qquad (8A.36)$$
The intercept, A, is given by:
$$A = \langle Y \rangle - B \langle X \rangle \qquad (8A.37)$$
where $\langle X \rangle$ and $\langle Y \rangle$ are the expected values (means) of the $X_i$ and $Y_i$ values, and
$$s_{XY} = \sum_{i=1}^{N} (X_i - \langle X \rangle) \cdots \qquad (8A.38)$$
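The slope and intercept formulas (8A.36) and (8A.37) can be sketched numerically as below. This is a minimal example under stated assumptions: it takes $s_{XY}$ to be the usual sum of cross-products of deviations from the means and $s_{XX}$ the usual sum of squared deviations of X, and the function name `fit_line` and the toy data are hypothetical.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares intercept A and slope B for Y = A + B*X + error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Assumed definitions: sums of cross-products and of squared deviations.
    s_xy = np.sum((x - x_mean) * (y - y_mean))
    s_xx = np.sum((x - x_mean) ** 2)
    slope = s_xy / s_xx                    # B = s_XY / s_XX
    intercept = y_mean - slope * x_mean    # A = <Y> - B <X>
    return float(intercept), float(slope)

# Toy data lying exactly on Y = 1 + 2X.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

Because the toy points lie exactly on a line, the fitted values recover that line's intercept and slope. A resampling test could call such a function repeatedly on bootstrapped or permuted data.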