the chi-square distribution for use with contingency tables also can be off
by an order of magnitude.
The good news is that there exists a class of tests, the permutation tests
described in Chapter 5, for which the significance levels are exact if the
observations are independent and identically distributed under the null
hypothesis or their labels are otherwise exchangeable.
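As a rough illustration of why such tests can be exact, the sketch below enumerates every relabeling of a small two-sample data set and counts how often the relabeled statistic is at least as extreme as the observed one. The function name, the one-sided alternative, and the data values are assumptions made for this example; they do not come from the text.

```python
from itertools import combinations


def exact_permutation_p_value(x, y):
    """One-sided exact p value for 'x tends to be larger than y'.

    Assumes the group labels are exchangeable under the null hypothesis.
    With fixed group sizes, the sum of the first group is equivalent to
    the difference in means as a test statistic.
    """
    pooled = list(x) + list(y)
    observed = sum(pooled[i] for i in range(len(x)))
    as_extreme = 0
    relabelings = 0
    # Every way of choosing which observations carry the first label.
    for idx in combinations(range(len(pooled)), len(x)):
        relabelings += 1
        if sum(pooled[i] for i in idx) >= observed:
            as_extreme += 1
    return as_extreme / relabelings


# Hypothetical measurements, chosen only for illustration.
treated = [12.1, 14.3, 13.8, 15.2]
control = [10.4, 11.9, 11.1, 12.0]
p_value = exact_permutation_p_value(treated, control)
print(f"exact one-sided p value: {p_value:.4f}")
```

Because the p value is a count over all equally likely relabelings, no distributional approximation is involved. Full enumeration is practical only for small samples; for larger ones, a random subset of relabelings is commonly used instead.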
CONFIDENCE INTERVALS
If p values are misleading, what are we to use in their place? Jones [1955,
p. 407] was among the first to suggest that “an investigator would be
misled less frequently and would be more likely to obtain the information
he seeks were he to formulate his experimental problems in terms of the
estimation of population parameters, with the establishment of confidence
intervals about the estimated values, rather than in terms of a null hypoth-
esis against all possible alternatives.” See also Gardner and Altman [1996]
and Poole [2001].
Confidence intervals can be derived from the rejection regions of our
hypothesis tests, whether the latter are based on parametric or nonparametric
methods. Suppose A(θ′) is a 1 − α level acceptance region for
testing the hypothesis θ = θ′; that is, we accept the hypothesis if our test
statistic T belongs to the acceptance region A(θ′) and reject it otherwise.
Let S(X) consist of all the parameter values θ* for which T(X) belongs to
the acceptance region A(θ*). Then S(X) is a 1 − α level confidence interval
for θ based on the set of observations X = {x₁, x₂, ..., xₙ}.
The probability that S(X) includes θ₀ when θ = θ₀ is equal to the probability
that T(X) belongs to the acceptance region of θ₀ and is greater than
or equal to 1 − α.
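The construction above can be sketched numerically for one concrete case. The example assumes a binomial count with an equal-tailed exact test and scans a grid of candidate parameter values, keeping those whose acceptance region contains the observed count. The choice of model, the grid resolution, the function names, and the data (7 successes in 20 trials) are all assumptions for illustration.

```python
from math import comb


def binom_pmf(j, n, theta):
    """Probability of exactly j successes in n trials."""
    return comb(n, j) * theta**j * (1 - theta) ** (n - j)


def in_acceptance_region(k, n, theta, alpha):
    """True if the observed count k is accepted when testing this theta.

    Equal-tailed exact test: accept unless k falls in a lower or upper
    tail whose probability is at most alpha / 2.
    """
    lower_tail = sum(binom_pmf(j, n, theta) for j in range(0, k + 1))
    upper_tail = sum(binom_pmf(j, n, theta) for j in range(k, n + 1))
    return lower_tail > alpha / 2 and upper_tail > alpha / 2


def confidence_set(k, n, alpha, grid_size=1000):
    """All grid values of theta that the test would accept."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return [theta for theta in grid if in_acceptance_region(k, n, theta, alpha)]


# Hypothetical data: 7 successes in 20 trials, 95% confidence.
accepted = confidence_set(k=7, n=20, alpha=0.05)
interval = (min(accepted), max(accepted))
print(f"approximate 95% confidence interval: {interval}")
```

The grid scan only approximates the endpoints to the grid resolution. For this test the accepted values form a single interval, but an inverted test need not yield an interval in general.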
As our confidence 1 - a increases, from 90% to 95%, for example, the
width of the resulting confidence interval increases. Thus, a 95% confi-
dence interval is wider than a 90% confidence interval.
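A small numerical check of this relationship, assuming a large-sample normal approximation for the mean; the data values and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def normal_mean_interval(data, confidence):
    """Approximate confidence interval for a mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    center = mean(data)
    half_width = z * stdev(data) / sqrt(len(data))
    return center - half_width, center + half_width


# Hypothetical observations, used only to compare interval widths.
sample = [4.8, 5.1, 5.6, 4.9, 5.3, 5.0, 5.4, 4.7, 5.2, 5.5]

low_90, high_90 = normal_mean_interval(sample, 0.90)
low_95, high_95 = normal_mean_interval(sample, 0.95)
width_90 = high_90 - low_90
width_95 = high_95 - low_95
print(f"90% width: {width_90:.3f}, 95% width: {width_95:.3f}")
```

With the same data and the same method, only the critical value changes, so the 95% interval is wider by the ratio of the two normal quantiles.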
By the same process, the rejection regions of our hypothesis tests can be
derived from confidence intervals. Suppose our hypothesis is that the odds
ratio for a 2 × 2 contingency table is 1. Then we would accept this null
hypothesis if and only if our confidence interval for the odds ratio includes
the value 1.
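The decision rule can be sketched as follows. The interval here uses the logit (Woolf) large-sample approximation, which is an assumption of this example and not a method named in the text; the cell counts and function names are likewise hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_interval(a, b, c, d, confidence=0.95):
    """Approximate confidence interval for the odds ratio of the table
    [[a, b], [c, d]], using the logit (Woolf) approximation.
    All four cell counts must be positive.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    log_odds_ratio = log((a * d) / (b * c))
    standard_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        exp(log_odds_ratio - z * standard_error),
        exp(log_odds_ratio + z * standard_error),
    )


def accept_unit_odds_ratio(a, b, c, d, confidence=0.95):
    """Accept the hypothesis 'odds ratio = 1' iff the interval contains 1."""
    lower, upper = odds_ratio_interval(a, b, c, d, confidence)
    return lower <= 1.0 <= upper


# Two hypothetical tables: one with a clear association, one without.
print(accept_unit_odds_ratio(20, 10, 10, 20))
print(accept_unit_odds_ratio(15, 15, 14, 16))
```

The test obtained this way inherits the properties of the interval: an approximate interval yields an approximate test, at significance level one minus the confidence level.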
A common error is to misinterpret the confidence interval as a state-
ment about the unknown parameter. It is not true that the probability
that a parameter is included in a 95% confidence interval is 95%. What is
true is that if we derive a large number of 95% confidence intervals, we
can expect the true value of the parameter to be included in the computed
intervals 95% of the time. (That is, the true values will be included if the
assumptions on which the tests and confidence intervals are based are sat-
isfied 100% of the time.) Like the p value, the upper and lower confidence