$$w = c_k^{-1} \sum_{n=1}^{N} m_k(x_n)\, y_n, \tag{6.31}$$
given all N observations. Hence, (6.31) is the batch formulation for the solution
that the incremental (6.30) approximates.
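As a quick numeric check (not from the text), the following sketch assumes the incremental update (6.30) takes the usual match-weighted running-average form and verifies that it reproduces the batch solution (6.31); the matching values drawn here are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=1000)        # targets y_n
m = rng.uniform(size=1000)       # assumed matching values m_k(x_n) in [0, 1]

# Batch solution (6.31): w = c_k^{-1} sum_n m_k(x_n) y_n, with c_k = sum_n m_k(x_n)
w_batch = (m * y).sum() / m.sum()

# Match-weighted running average (assumed form of the incremental update (6.30))
w, c = 0.0, 0.0
for m_n, y_n in zip(m, y):
    if m_n == 0.0:
        continue                 # unmatched input: no update
    c += m_n
    w += (m_n / c) * (y_n - w)

print(np.isclose(w, w_batch))    # True: the incremental run reproduces the batch value
```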
Applying this relation to the XCS update equations for the mixing parameters, the mixing model employed by XCS(F) can be described as follows: the error $\epsilon_k$ of classifier $k$ in XCS(F) is the mean absolute prediction error of its local model, and is given by
$$\epsilon_k = c_k^{-1} \sum_{n=1}^{N} m_k(x_n) \left| y_n - \hat{w}_k^\top x_n \right|. \tag{6.32}$$
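In vectorised form, (6.32) can be computed for all classifiers at once. The sketch below is illustrative only; the matrix layout and all names are assumptions, not the book's notation.

```python
import numpy as np

def mean_abs_errors(M, X, Y, W):
    """Mean absolute prediction errors eps_k of Eq. (6.32).
    M: (N, K) matching values m_k(x_n); X: (N, D) inputs;
    Y: (N,) targets; W: (K, D) local model weight vectors w_k."""
    resid = np.abs(Y[:, None] - X @ W.T)            # (N, K): |y_n - w_k^T x_n|
    return (M * resid).sum(axis=0) / M.sum(axis=0)  # divide by match counts c_k
```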
The classifier's accuracy is some inverse function $\kappa(\epsilon_k)$ of the classifier error. This function was initially given by an exponential [237], but was later [239, 57] redefined to
$$\kappa(\epsilon) = \begin{cases} 1 & \text{if } \epsilon < \epsilon_0, \\ \alpha \left( \epsilon / \epsilon_0 \right)^{-\nu} & \text{otherwise,} \end{cases} \tag{6.33}$$
where the constant scalar $\epsilon_0$ is known as the minimum error, the constant $\alpha$ is a scaling factor, and the constant $\nu$ is a mixing power factor [57]. The accuracy is constantly 1 up to the error $\epsilon_0$ and then drops off steeply, with the shape of the drop determined by $\alpha$ and $\nu$. The relative accuracy is a classifier's accuracy for
a single input normalised by the sum of the accuracies of all classifiers matching
that input. The fitness is the relative accuracy of a classifier averaged over all
inputs that it matches, that is
$$F_k = c_k^{-1} \sum_{n=1}^{N} \frac{m_k(x_n)\, \kappa(\epsilon_k)}{\sum_{j=1}^{K} m_j(x_n)\, \kappa(\epsilon_j)}. \tag{6.34}$$
This fitness is the measure of a classifier's prediction quality, and hence $\gamma_k$ is given, independently of the input, by $\gamma_k(x) = F_k$.
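Putting (6.33) and (6.34) together, a minimal self-contained sketch of the accuracy and fitness computations might look as follows. All names and parameter defaults are hypothetical; the matching matrix and errors $\epsilon_k$ are assumed given (the latter as in (6.32)), and every input is assumed to be matched by at least one classifier, as XCS's covering mechanism ensures.

```python
import numpy as np

def accuracy(errors, eps0=0.01, alpha=0.1, nu=5.0):
    """Accuracy kappa(eps) of Eq. (6.33): constantly 1 below the minimum
    error eps0, then a power-law drop shaped by alpha and nu."""
    errors = np.asarray(errors, dtype=float)
    return np.where(errors < eps0, 1.0, alpha * (errors / eps0) ** (-nu))

def fitness(M, errors, **kappa_args):
    """Fitness F_k of Eq. (6.34), used as the input-independent mixing
    weight gamma_k(x) = F_k.
    M: (N, K) matching values m_k(x_n); errors: (K,) errors eps_k."""
    kappa = accuracy(errors, **kappa_args)
    num = M * kappa                             # m_k(x_n) * kappa(eps_k)
    rel = num / num.sum(axis=1, keepdims=True)  # relative accuracy per input
    c = M.sum(axis=0)                           # match counts c_k
    return rel.sum(axis=0) / c                  # average over matched inputs

# Example: two classifiers, the second with an error well above eps0
M = np.array([[1.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
print(fitness(M, errors=[0.005, 0.05]))  # roughly [1.0, 3e-5]
```

As the example illustrates, with typical parameter settings the power-law drop of (6.33) pushes the accuracy, and hence the relative accuracy share, of a classifier with an error even moderately above $\epsilon_0$ close to zero.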
Note that the magnitude of a relative accuracy depends on both the error
of a classifier, and on the error of the classifiers that match the same input.
This makes the fitness of classifier k dependent on inputs that are matched by
classifiers that share inputs with classifier k , but are not necessarily matched by
this classifier. This might be a good measure of a classifier's fitness (where prediction quality is not all that counts), but it performs poorly as a measure of the classifier's prediction quality alone.
6.3 Empirical Comparison
In order to compare how well the different heuristics perform with respect to the
aim of maximising (6.1), their performance is evaluated on a set of four regression
tasks. The results show that i) mixing by inverse variance outperforms the other
 