Table 6.3. p-values for Tukey's HSD post-hoc comparison of the different mixing
methods. The performance values were gathered in 50 experimental runs per function,
using both averaging classifiers and classifiers that model straight lines. The reported
p-values are for a post-hoc comparison considering only the factor that determines
the mixing method. The methods are ordered by performance, with the leftmost and
bottom method being the best-performing one. p-values in italics indicate that no
significant difference between the methods was detected at the 0.01 level.
         IRLS    IRLSf   InvVar  MaxConf Conf    LSf     LS      XCS
XCS      0.0000  0.0000  0.0000  0.0000  0.0000  0.0283  0.5131  -
LS       0.0000  0.0000  0.0000  0.0000  0.0000  0.8574  -
LSf      0.0000  0.0000  0.0000  0.0095  0.0150  -
Conf     0.0000  0.0000  0.1044  0.9999  -
MaxConf  0.0000  0.0000  0.1445  -
InvVar   0.0001  0.0002  -
IRLSf    0.8657  -
IRLS     -
The p-values of Tukey's HSD post-hoc test are given in Table 6.3. They show that
the performance differences between all pairs of methods are significant at the 0.01
level, except for the pairs whose p-values are printed in italics.
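The pairwise p-values reported in Table 6.3 are based on the studentized range statistic that underlies Tukey's HSD. As a rough illustration (not the implementation used for the experiments), the following sketch computes the pairwise q statistics for equally sized groups of performance values; the corresponding p-values would then be read from the studentized range distribution (for example via scipy.stats.studentized_range.sf):

```python
from statistics import mean

def tukey_q(groups):
    """Pairwise studentized-range statistics for Tukey's HSD.

    Assumes equally sized groups; q_ij = |mean_i - mean_j| / sqrt(MSE / n),
    where MSE is the pooled within-group mean squared error.
    """
    k = len(groups)           # number of methods compared
    n = len(groups[0])        # runs per method
    means = [mean(g) for g in groups]
    # pooled within-group mean squared error over all groups
    mse = sum(sum((x - m) ** 2 for x in g)
              for g, m in zip(groups, means)) / (k * (n - 1))
    se = (mse / n) ** 0.5
    return {(i, j): abs(means[i] - means[j]) / se
            for i in range(k) for j in range(i + 1, k)}
```

Each q statistic is then compared against the critical value of the studentized range distribution with k groups and k(n - 1) degrees of freedom.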
The same experiments were performed with K ∈ {20, 100, 400} classifiers,
yielding qualitatively similar results. This shows that the presented performance
differences are not sensitive to the number of classifiers used.
6.3.3 Discussion
As can be seen from the results, IRLS is in almost all cases significantly better
than, and in no case significantly worse than, any of the other methods that were
applied. IRLSf uses more information than IRLS to mix the classifier predictions
and can thus be expected to perform better. As Table 6.2 shows, however, it
frequently performs worse, though not significantly so. This can be attributed
to the stopping criterion used, which is based on the relative change of the
likelihood between two successive iterations. The likelihood increases more slowly
when using IRLSf, which leads the stopping criterion to abort learning earlier for
IRLSf than for IRLS, causing it to perform worse.
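The effect described above can be illustrated with a toy sketch (the likelihood traces below are made up for illustration, not taken from the experiments): a relative-change stopping criterion terminates earlier on a likelihood that approaches its optimum more slowly, and therefore stops at a worse likelihood value:

```python
def iterations_until_stop(loglik, rel_tol=0.05, max_iter=100):
    """Iterate until the relative change of the log-likelihood between two
    successive iterations falls below rel_tol; return (iteration, value)."""
    prev = loglik(0)
    for t in range(1, max_iter + 1):
        cur = loglik(t)
        if abs(cur - prev) / abs(prev) < rel_tol:
            return t, cur
        prev = cur
    return max_iter, prev

# Toy traces approaching the same optimum -1.0 at different speeds.
fast = lambda t: -1.0 - 0.5 ** t   # rapid increase, as with IRLS
slow = lambda t: -1.0 - 0.9 ** t   # slow increase, as with IRLSf

t_fast, ll_fast = iterations_until_stop(fast)
t_slow, ll_slow = iterations_until_stop(slow)
# The slowly increasing trace triggers the criterion earlier,
# at a log-likelihood further from the optimum.
```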
InvVar is the best of the introduced heuristics and consistently outperforms
LS and LSf. Even though it does not perform significantly better than Conf and
MaxConf, its mean performance is higher and the method relies on fewer
assumptions. Thus, it should be the preferred method amongst the heuristics that
were introduced.
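To make the heuristic concrete, inverse-variance mixing can be sketched as follows (a minimal illustration; the function and variable names are ours, not from the experiments): each matching classifier's prediction is weighted in proportion to the reciprocal of its estimated prediction variance, so that less noisy local models dominate the mixture:

```python
def invvar_mix(predictions, variances):
    """Mix local predictions with weights proportional to 1 / variance.

    predictions: point predictions of the classifiers matching the input
    variances:   their estimated prediction variances (all > 0)
    """
    inv = [1.0 / v for v in variances]
    total = sum(inv)
    return sum((w / total) * p for w, p in zip(inv, predictions))

# A low-variance classifier pulls the mixed prediction towards itself:
# invvar_mix([0.0, 10.0], [1.0, 4.0]) weights the two as 0.8 : 0.2.
```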
As expected, XCS features worse performance than all other methods, which
can be attributed to the fact that the performance measure of a local model is
influenced by the performance of the other local models that match the same
inputs. This might introduce some smoothing, but it remains questionable whether
such smoothing is ever advantageous. This doubt is justified by observing that XCS