the problems met in the experimental study. For example, in a multiple-problem
comparison, each block corresponds to the results obtained for a specific problem.
When referring to multiple comparison tests, a block is composed of three or more
subjects or results, each one corresponding to the performance evaluation of one
algorithm on the problem.
Please remember that in pairwise analysis, if we try to draw a conclusion involving
more than one pairwise comparison, we will obtain an accumulated error coming
from their combination, losing control of the Family-Wise Error Rate (FWER) (see
Eq. 2.2).
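As a rough illustration (assuming, for simplicity, that the m pairwise comparisons are
independent and each is performed at significance level α), the accumulated probability
of committing at least one Type I error is

$$P(\text{at least one Type I error}) = 1 - (1 - \alpha)^m,$$

so with α = 0.05 and m = 5 comparisons it already grows to about 1 − 0.95⁵ ≈ 0.23,
far above the nominal 0.05 level.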
This section is devoted to describing the use of several procedures for multiple
comparisons that consider a control method. In this sense, a control method can be
defined as the algorithm of most interest to the researcher of the experimental study
(usually the researcher's new proposal). Therefore, its performance will be contrasted
against that of the rest of the algorithms in the study.
The best-known procedure for testing the differences between more than two
related samples, the Friedman test, will be introduced in the following.
Friedman test
The Friedman test [7, 8] (Friedman two-way analysis of variance by ranks) is a
nonparametric analog of the parametric two-way analysis of variance. It can be used
to answer the following question: in a set of k samples (where k ≥ 2), do at least two
of the samples represent populations with different median values? The Friedman
test is the analog of the repeated measures ANOVA in non-parametric statistical
procedures; therefore, it is a multiple comparisons test that aims to detect significant
differences between the behavior of two or more algorithms.
The null hypothesis for Friedman's test states equality of medians between the
populations. The alternative hypothesis is defined as the negation of the null hypoth-
esis, so it is nondirectional.
The Friedman test method is described as follows: It ranks the algorithms for each
data set separately, the best performing algorithm getting the rank of 1, the second
best rank 2, and so on. In case of ties, average ranks are assigned. Let $r_i^j$ be the rank
of the $j$-th of $k$ algorithms on the $i$-th of $N$ data sets. The Friedman test compares
the average ranks of the algorithms, $R_j = \frac{1}{N} \sum_i r_i^j$. Under the null hypothesis, which
states that all the algorithms are equivalent and so their ranks $R_j$ should be equal,
the Friedman statistic

$$\chi_F^2 = \frac{12N}{k(k+1)} \left[ \sum_j R_j^2 - \frac{k(k+1)^2}{4} \right] \qquad (2.3)$$

is distributed according to a $\chi^2$ distribution with $k - 1$ degrees of freedom, when $N$
and $k$ are big enough (as a rule of thumb, $N > 10$ and $k > 5$).
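As a practical illustration, the following minimal Python sketch (not from the original
text; the results matrix is invented and the use of numpy and scipy.stats.rankdata is an
implementation choice) computes the average ranks R_j and the statistic of Eq. (2.3) for
a results matrix with one row per data set and one column per algorithm, assuming that
lower values mean better performance:

    import numpy as np
    from scipy.stats import rankdata, chi2

    # Hypothetical results: N data sets (rows) x k algorithms (columns).
    # Lower values are assumed to mean better performance (e.g., error rates).
    results = np.array([
        [0.12, 0.15, 0.20],
        [0.30, 0.28, 0.35],
        [0.05, 0.07, 0.06],
    ])

    N, k = results.shape

    # Rank the algorithms within each data set (rank 1 = best);
    # rankdata assigns average ranks in case of ties.
    ranks = np.apply_along_axis(rankdata, 1, results)

    # Average rank R_j of each algorithm over the N data sets.
    R = ranks.mean(axis=0)

    # Friedman statistic of Eq. (2.3).
    chi2_F = (12 * N / (k * (k + 1))) * (np.sum(R ** 2) - k * (k + 1) ** 2 / 4)

    # Under the null hypothesis it approximately follows a chi-square
    # distribution with k - 1 degrees of freedom.
    p_value = chi2.sf(chi2_F, df=k - 1)

    print("Average ranks:", R)
    print("Friedman chi-square: %.4f, p-value: %.4f" % (chi2_F, p_value))

SciPy also provides the same test (with a correction for ties) as
scipy.stats.friedmanchisquare, which takes each algorithm's result vector as a separate
argument and can serve as a cross-check.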
 