Table 6.3. p-values for Tukey's HSD post-hoc comparison of the different mixing
methods. The performance values were gathered in 50 experimental runs per function,
using both averaging classifiers and classifiers that model straight lines. The reported
p-values are for a post-hoc comparison considering only the factor that determines
the mixing method. The methods are ordered by performance, with the leftmost and
bottom method being the best-performing one. p-values in italics indicate that no
significant difference between the methods was detected at the 0.01 level.
         IRLS    IRLSf   InvVar  MaxConf Conf    LSf     LS      XCS
XCS      0.0000  0.0000  0.0000  0.0000  0.0000  0.0283  0.5131  -
LS       0.0000  0.0000  0.0000  0.0000  0.0000  0.8574  -
LSf      0.0000  0.0000  0.0000  0.0095  0.0150  -
Conf     0.0000  0.0000  0.1044  0.9999  -
MaxConf  0.0000  0.0000  0.1445  -
InvVar   0.0001  0.0002  -
IRLSf    0.8657  -
IRLS     -
The p-values of Tukey's HSD post-hoc test are given in Table 6.3. They show that
the performance differences between all pairs of methods are significant at the 0.01
level, except for the pairs whose p-values are printed in italics.
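The pairwise p-values reported in Table 6.3 are based on the studentized range statistic that underlies Tukey's HSD. As a rough illustration (not the implementation used for the experiments), the following sketch computes the pairwise q statistics for equally sized groups of performance values; the corresponding p-values would then be read from the studentized range distribution (for example via scipy.stats.studentized_range.sf):

```python
from statistics import mean

def tukey_q(groups):
    """Pairwise studentized-range statistics for Tukey's HSD.

    Assumes equally sized groups; q_ij = |mean_i - mean_j| / sqrt(MSE / n),
    where MSE is the pooled within-group mean squared error.
    """
    k = len(groups)           # number of methods compared
    n = len(groups[0])        # runs per method
    means = [mean(g) for g in groups]
    # pooled within-group mean squared error over all groups
    mse = sum(sum((x - m) ** 2 for x in g)
              for g, m in zip(groups, means)) / (k * (n - 1))
    se = (mse / n) ** 0.5
    return {(i, j): abs(means[i] - means[j]) / se
            for i in range(k) for j in range(i + 1, k)}
```

Each q statistic is then compared against the critical value of the studentized range distribution with k groups and k(n - 1) degrees of freedom.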
The same experiments were performed with K ∈ {20, 100, 400} classifiers,
yielding qualitatively similar results. This shows that the presented performance
differences are not sensitive to the number of classifiers used.
6.3.3 Discussion
As can be seen from the results, IRLS is in almost all cases significantly better
than, and in no case significantly worse than, any of the other methods that were
applied. IRLSf uses more information than IRLS to mix the classifier predictions
and can thus be expected to perform better. As Table 6.2 shows, however, it
frequently performs worse, though not significantly so. This can be attributed
to the stopping criterion used, which is based on the relative change of the
likelihood between two successive iterations. The likelihood increases more slowly
when using IRLSf, which leads the stopping criterion to abort learning earlier for
IRLSf than for IRLS, causing it to perform worse.
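The effect described above can be illustrated with a toy sketch (the likelihood traces below are made up for illustration, not taken from the experiments): a relative-change stopping criterion terminates earlier on a likelihood that approaches its optimum more slowly, and therefore stops at a worse likelihood value:

```python
def iterations_until_stop(loglik, rel_tol=0.05, max_iter=100):
    """Iterate until the relative change of the log-likelihood between two
    successive iterations falls below rel_tol; return (iteration, value)."""
    prev = loglik(0)
    for t in range(1, max_iter + 1):
        cur = loglik(t)
        if abs(cur - prev) / abs(prev) < rel_tol:
            return t, cur
        prev = cur
    return max_iter, prev

# Toy traces approaching the same optimum -1.0 at different speeds.
fast = lambda t: -1.0 - 0.5 ** t   # rapid increase, as with IRLS
slow = lambda t: -1.0 - 0.9 ** t   # slow increase, as with IRLSf

t_fast, ll_fast = iterations_until_stop(fast)
t_slow, ll_slow = iterations_until_stop(slow)
# The slowly increasing trace triggers the criterion earlier,
# at a log-likelihood further from the optimum.
```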
InvVar is the best of the introduced heuristics and consistently outperforms
LS and LSf. Even though it does not perform significantly better than Conf and
MaxConf, its mean performance is higher and the method relies on fewer
assumptions. Thus, it should be the preferred method amongst the heuristics that
were introduced.
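To make the heuristic concrete, inverse-variance mixing can be sketched as follows (a minimal illustration; the function and variable names are ours, not from the experiments): each matching classifier's prediction is weighted in proportion to the reciprocal of its estimated prediction variance, so that less noisy local models dominate the mixture:

```python
def invvar_mix(predictions, variances):
    """Mix local predictions with weights proportional to 1 / variance.

    predictions: point predictions of the classifiers matching the input
    variances:   their estimated prediction variances (all > 0)
    """
    inv = [1.0 / v for v in variances]
    total = sum(inv)
    return sum((w / total) * p for w, p in zip(inv, predictions))

# A low-variance classifier pulls the mixed prediction towards itself:
# invvar_mix([0.0, 10.0], [1.0, 4.0]) weights the two as 0.8 : 0.2.
```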
As expected, XCS features worse performance than all other methods, which
can be attributed to the fact that the performance measure of a local model is
influenced by the performance of the other local models that match the same
inputs. This might introduce some smoothing, but it remains questionable whether
such smoothing is ever advantageous. This doubt is justified by observing that XCS