the two treatment groups. Under the null hypothesis, this will not affect
the results; under an alternative hypothesis, the two bootstrap sample
means will be closer together than they would be if drawn separately from the
two populations. The difference in means between the two samples that
were drawn originally should stand out as an extreme value.
Hall and Wilson [1991] also recommend that the bootstrap be applied
only to statistics that, for very large samples, will have distributions that do
not depend on any unknowns. 5 In the present example, Hall and Wilson
[1991] recommend the use of the t statistic, rather than the simple differ-
ence of means, as leading to a test that is both closer to exact and more
powerful.
Suppose we draw several hundred such bootstrap samples with replace-
ment from the combined sample and compute the t statistic each time. We
would then compare the original value of the test statistic, Student's t in
this example, with the resulting bootstrap distribution to determine what
decision to make.
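A minimal sketch of this procedure in Python, assuming two hypothetical samples x and y and using scipy's ttest_ind to compute Student's t; the function name, the number of replicates, and the two-sided comparison are illustrative choices, not prescriptions from the text.

```python
import numpy as np
from scipy import stats

def bootstrap_t_test(x, y, n_boot=500, seed=0):
    """Two-sample bootstrap test using Student's t as the test statistic.

    Bootstrap samples are drawn with replacement from the combined
    sample, so the resampling is carried out under the null hypothesis
    of a common population.
    """
    rng = np.random.default_rng(seed)
    combined = np.concatenate([x, y])
    t_obs = stats.ttest_ind(x, y).statistic

    t_boot = np.empty(n_boot)
    for b in range(n_boot):
        bx = rng.choice(combined, size=len(x), replace=True)
        by = rng.choice(combined, size=len(y), replace=True)
        t_boot[b] = stats.ttest_ind(bx, by).statistic

    # How extreme is the original t among the bootstrap replicates?
    p_value = np.mean(np.abs(t_boot) >= abs(t_obs))
    return t_obs, p_value
```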
Pairwise Dependence. If the covariances are the same for each pair of
observations, then the permutation test described previously is an exact
test if the observations are normally distributed (Lehmann, 1986) and is
almost exact otherwise.
Even if the covariances are not equal, if the covariance matrix is non-
singular, we may use the inverse of this covariance matrix to transform the
original (dependent) variables to independent (and hence exchangeable)
variables. After this transformation, the assumptions are satisfied so that a
permutation test can be applied. This result holds even if the variables are
collinear. Let R denote the rank of the covariance matrix in the singular
case. Then there exists a projection onto an R-dimensional subspace
where R normal random variables are independent. So if we have an
N-dimensional (N > R) correlated and singular multivariate normal
distribution, there exists a set of R linear combinations of the original N
variables so that the R linear combinations are each univariate normal and
independent.
The preceding is only of theoretical interest unless we have some inde-
pendent source from which to obtain an estimate of the covariance matrix.
If we use the data at hand to estimate the covariances, the estimates will
be interdependent and so will the transformed observations.
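As a concrete illustration, the sketch below decorrelates the observations given a covariance matrix that is assumed to be known or estimated from an independent source, first in the nonsingular case via a Cholesky factor and then in the singular case via a projection onto the R-dimensional subspace. The function names and the rank tolerance are illustrative, not part of the original text.

```python
import numpy as np

def whiten(observations, sigma):
    """Transform correlated observations to independent ones using the
    inverse Cholesky factor of a nonsingular covariance matrix.

    observations : (n, N) array of N-variate observations
    sigma        : (N, N) nonsingular covariance matrix
    """
    L = np.linalg.cholesky(sigma)            # sigma = L @ L.T
    # Solving L z = x.T gives z = L^{-1} x.T, whose rows have identity
    # covariance when the rows of x have covariance sigma.
    return np.linalg.solve(L, observations.T).T

def project_to_full_rank(observations, sigma, tol=1e-10):
    """Singular case: project onto the R-dimensional subspace spanned by
    eigenvectors with nonzero eigenvalues, then rescale so the R linear
    combinations are independent standard normals."""
    vals, vecs = np.linalg.eigh(sigma)
    keep = vals > tol * vals.max()           # R nonzero eigenvalues
    return (observations @ vecs[:, keep]) / np.sqrt(vals[keep])
```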
Moving Average or Autoregressive Process. These cases are best
treated by the same methods and are subject to the caveats described in
Part 3 of this text.
5 Such statistics are termed asymptotically pivotal.