Wilcoxon test; the use of the ranks in the combined sample
reduces, though it does not eliminate, the effect of the difference
in variability between the two samples.
Generalized Wilcoxon test (see O'Brien [1988]).
Procedure described in Manly and Francis [1999].
Procedure described in Chapter 7 of Weerahandi [1995].
Procedure described in Chapter 10 of Pesarin [2001].
Bootstrap. See the section on dependent observations in what
follows.
Permutation test. Phillip Good conducted simulations for sample
sizes between 6 and 12 drawn from normally distributed populations.
The populations in these simulations had variances that differed
by up to a factor of five, and nominal p values of 5% were
accurate to within 1.5%.
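The kind of simulation described for the permutation test can be sketched as follows. This is a minimal illustration, not the code used in the original study: the function names, the Monte Carlo (rather than exhaustive) permutation scheme, the choice of the difference in means as the test statistic, and the default numbers of simulations and rearrangements are all assumptions made here.

```python
import random
import statistics


def permutation_p_value(x, y, n_perm=2000, rng=None):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    rng = rng or random.Random()
    observed = abs(statistics.fmean(x) - statistics.fmean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        # Reassign the pooled observations to the two groups at random.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_x]) - statistics.fmean(pooled[n_x:]))
        # Small tolerance guards against floating-point noise in ties.
        if diff >= observed - 1e-12:
            hits += 1
    # Count the observed arrangement itself so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)


def rejection_rate(n_x, n_y, sd_x, sd_y, alpha=0.05,
                   n_sim=200, n_perm=300, seed=0):
    """Fraction of null datasets (equal means, normal data) rejected at alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x = [rng.gauss(0.0, sd_x) for _ in range(n_x)]
        y = [rng.gauss(0.0, sd_y) for _ in range(n_y)]
        if permutation_p_value(x, y, n_perm=n_perm, rng=rng) <= alpha:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    # A variance ratio of 5 corresponds to a standard-deviation ratio of sqrt(5).
    print(rejection_rate(n_x=8, n_y=8, sd_x=1.0, sd_y=5 ** 0.5))
```

If the nominal 5% level holds up, the printed rate should fall near 0.05. With the small default settings the estimate is noisy, so n_sim and n_perm should be raised before drawing any conclusion.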
Hilton [1996] compared the power of the Wilcoxon test, O'Brien's
test, and the Smirnov test in the presence of both location shift and scale
(variance) alternatives. As the relative influence of the difference in
variances grows, the O'Brien test is most powerful. The Wilcoxon test loses
power in the face of different variances. If the variance ratio is 4 : 1, the
Wilcoxon test is not trustworthy.
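One way to examine the behavior of the Wilcoxon test when the variances differ is to simulate it directly. The sketch below is not Hilton's procedure. It assumes normal data with equal means, uses the large-sample normal approximation to the rank-sum statistic without a tie correction, and the function names and sample sizes are illustrative choices.

```python
import random
from statistics import NormalDist


def rank_sum_p_value(x, y):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation."""
    n_x, n_y = len(x), len(y)
    # Label each value with its group (0 for x, 1 for y) and sort the pool.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average.
        midrank = (i + 1 + j) / 2
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean_w = n_x * (n_x + n_y + 1) / 2
    sd_w = (n_x * n_y * (n_x + n_y + 1) / 12) ** 0.5
    z = (rank_sum_x - mean_w) / sd_w
    return 2 * (1 - NormalDist().cdf(abs(z)))


def wilcoxon_rejection_rate(n_x, n_y, sd_x, sd_y, alpha=0.05,
                            n_sim=2000, seed=0):
    """Fraction of equal-mean normal datasets rejected at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        x = [rng.gauss(0.0, sd_x) for _ in range(n_x)]
        y = [rng.gauss(0.0, sd_y) for _ in range(n_y)]
        if rank_sum_p_value(x, y) <= alpha:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    # A 4 : 1 variance ratio corresponds to a 2 : 1 standard-deviation ratio.
    print(wilcoxon_rejection_rate(n_x=10, n_y=30, sd_x=2.0, sd_y=1.0))
```

Comparing the printed rate with the nominal 0.05, and repeating the run with the larger variance assigned to the larger sample, gives a rough picture of how the variance ratio affects the test.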
One point is unequivocal. William Anderson writes, “The first issue is to
understand why the variances are so different, and what does this mean to
the patient. It may well be the case that a new treatment is not appropriate
because of higher variance, even if the difference in means is favorable.
This issue is important whether or not the difference was anticipated.
Even if the regulatory agency does not raise the issue, I want to do so
internally.”
David Salsburg agrees. “If patients have been assigned at random to the
various treatment groups, the existence of a significant difference in any
parameter of the distribution suggests that there is a difference in
treatment effect. The problem is not how to compare the means but how to
determine what aspect of this difference is relevant to the purpose of the
study.
“Since the variances are significantly different, I can think of two
situations where this might occur:
1. In many measurements there are minimum and maximum values
that are possible, e.g. the Hamilton Depression Scale, or the
number of painful joints in arthritis. If one of the treatments is
very effective, it will tend to push values into one of the extremes.
This will produce a change in distribution from a relatively
symmetric one to a skewed one, with a corresponding change in
variance.
2. The experimental subjects may represent a mixture of
populations. The difference in variance may occur because the