the two treatment groups. Under the null hypothesis, this will not affect
the results; under an alternative hypothesis, the two bootstrap sample
means will be closer together than they would be if drawn separately from the
two populations. The difference in means between the two samples that
were drawn originally should stand out as an extreme value.
Hall and Wilson [1991] also recommend that the bootstrap be applied
only to statistics that, for very large samples, will have distributions that do
not depend on any unknowns. 5 In the present example, Hall and Wilson
[1991] recommend the use of the t statistic, rather than the simple differ-
ence of means, as leading to a test that is both closer to exact and more
powerful.
Suppose we draw several hundred such bootstrap samples with replace-
ment from the combined sample and compute the t statistic each time. We
would then compare the original value of the test statistic, Student's t in
this example, with the resulting bootstrap distribution to determine what
decision to make.
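A minimal sketch of this procedure in Python, assuming two hypothetical samples x and y and using scipy's ttest_ind to compute Student's t; the function name, the number of replicates, and the two-sided comparison are illustrative choices, not prescriptions from the text.

```python
import numpy as np
from scipy import stats

def bootstrap_t_test(x, y, n_boot=500, seed=0):
    """Two-sample bootstrap test using Student's t as the test statistic.

    Bootstrap samples are drawn with replacement from the combined
    sample, so the resampling is carried out under the null hypothesis
    of a common population.
    """
    rng = np.random.default_rng(seed)
    combined = np.concatenate([x, y])
    t_obs = stats.ttest_ind(x, y).statistic

    t_boot = np.empty(n_boot)
    for b in range(n_boot):
        bx = rng.choice(combined, size=len(x), replace=True)
        by = rng.choice(combined, size=len(y), replace=True)
        t_boot[b] = stats.ttest_ind(bx, by).statistic

    # How extreme is the original t among the bootstrap replicates?
    p_value = np.mean(np.abs(t_boot) >= abs(t_obs))
    return t_obs, p_value
```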
Pairwise Dependence. If the covariances are the same for each pair of
observations, then the permutation test described previously is an exact
test if the observations are normally distributed (Lehmann, 1986) and is
almost exact otherwise.
Even if the covariances are not equal, if the covariance matrix is non-
singular, we may use the inverse of this covariance matrix to transform the
original (dependent) variables to independent (and hence exchangeable)
variables. After this transformation, the assumptions are satisfied so that a
permutation test can be applied. This result holds even if the variables are
collinear. Let R denote the rank of the covariance matrix in the singular
case. Then there exists a projection onto an R-dimensional subspace
where R normal random variables are independent. So if we have an
N-dimensional (N > R) correlated and singular multivariate normal
distribution, there exists a set of R linear combinations of the original N
variables so that the R linear combinations are each univariate normal and
independent.
The preceding is only of theoretical interest unless we have some inde-
pendent source from which to obtain an estimate of the covariance matrix.
If we use the data at hand to estimate the covariances, the estimates will
be interdependent and so will the transformed observations.
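As a concrete illustration, the sketch below decorrelates the observations given a covariance matrix that is assumed to be known or estimated from an independent source, first in the nonsingular case via a Cholesky factor and then in the singular case via a projection onto the R-dimensional subspace. The function names and the rank tolerance are illustrative, not part of the original text.

```python
import numpy as np

def whiten(observations, sigma):
    """Transform correlated observations to independent ones using the
    inverse Cholesky factor of a nonsingular covariance matrix.

    observations : (n, N) array of N-variate observations
    sigma        : (N, N) nonsingular covariance matrix
    """
    L = np.linalg.cholesky(sigma)            # sigma = L @ L.T
    # Solving L z = x.T gives z = L^{-1} x.T, whose rows have identity
    # covariance when the rows of x have covariance sigma.
    return np.linalg.solve(L, observations.T).T

def project_to_full_rank(observations, sigma, tol=1e-10):
    """Singular case: project onto the R-dimensional subspace spanned by
    eigenvectors with nonzero eigenvalues, then rescale so the R linear
    combinations are independent standard normals."""
    vals, vecs = np.linalg.eigh(sigma)
    keep = vals > tol * vals.max()           # R nonzero eigenvalues
    return (observations @ vecs[:, keep]) / np.sqrt(vals[keep])
```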
Moving Average or Autoregressive Process. These cases are best
treated by the same methods and are subject to the caveats described in
Part 3 of this text.
5 Such statistics are termed asymptotically pivotal.