A Probabilistic Model for LCS - Design and Analysis of Learning Classifier Systems - page 56

[Figure: two surface plots, (a) g1(x) and (b) g2(x), each drawn over x1 and x2 from -4 to 4, with function values between 0 and 1.]

Fig. 4.4. Plots showing the generalised softmax function (4.22) for 2 classifiers with inputs x = (1, x_1, x_2)^T and φ(x) = x, where Classifier 1 in plot (a) has gating parameters v_1 = (0, 0, 1)^T and matches a circle of radius 3 around the origin, and Classifier 2 in plot (b) has gating parameters v_2 = (0, 1, 0)^T and matches all inputs

and m_2(x) = 1 for all x. Therefore, Classifier 1 matches a circle of radius 3 around the origin, and Classifier 2 matches the whole input space. The values for g_1(x) and g_2(x) are shown in Figs. 4.4(a) and 4.4(b), respectively. As can be seen, the whole part of the input space that is not matched by Classifier 1 is fully assigned to Classifier 2 by g_2(x) = 1. In the circular area where both classifiers match, the softmax function performs a soft linear partitioning of the input space, just as in Fig. 4.2.
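The setting of Fig. 4.4 can be reproduced numerically with a short sketch. It assumes that (4.22) has the form g_k(x) = m_k(x) exp(v_k^T φ(x)) / Σ_j m_j(x) exp(v_j^T φ(x)), which is not restated on this page, and that the circular matching region includes its boundary. The names generalised_softmax, V and match are hypothetical and chosen for illustration only.

```python
import numpy as np

def generalised_softmax(x, V, match, phi=lambda x: x):
    """Gating weights under the ASSUMED form of (4.22):
    g_k(x) = m_k(x) exp(v_k^T phi(x)) / sum_j m_j(x) exp(v_j^T phi(x)).
    Requires that at least one classifier matches x."""
    x = np.asarray(x, dtype=float)
    m = np.array([m_k(x) for m_k in match], dtype=float)  # matching degrees
    a = V @ phi(x)                                         # activations v_k^T phi(x)
    w = m * np.exp(a - a.max())                            # shift for numerical stability
    return w / w.sum()

# Gating parameters from the caption of Fig. 4.4, one row per classifier.
V = np.array([[0.0, 0.0, 1.0],   # v_1
              [0.0, 1.0, 0.0]])  # v_2

# Matching functions; inputs are x = (1, x_1, x_2)^T.
match = [
    lambda x: 1.0 if x[1] ** 2 + x[2] ** 2 <= 9.0 else 0.0,  # circle of radius 3 (boundary inclusion assumed)
    lambda x: 1.0,                                           # matches all inputs
]

print(generalised_softmax([1.0, 0.0, 0.0], V, match))   # inside the circle
print(generalised_softmax([1.0, 4.0, 0.0], V, match))   # outside the circle
```

Inside the circle the weights depend only on x_2 - x_1, which is the soft linear partition described above. Outside the circle, m_1(x) = 0 forces g_1(x) = 0 and hence g_2(x) = 1.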

The effect of changing the transfer function to φ(x) = 1 is visualised in Fig. 4.5, and shows that in such a case no linear partitioning takes place. Rather, in areas of the input space that both classifiers match, (4.22) assigns the generation probabilities input-independently in proportion to the exponential of the gating parameters v_1 = 0.7 and v_2 = 0.3.
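As a worked instance of this statement, and assuming that (4.22) normalises the exponentiated gating parameters over the matching classifiers, the weights in the region matched by both classifiers would be constant:

```latex
g_1 = \frac{e^{0.7}}{e^{0.7} + e^{0.3}} = \frac{1}{1 + e^{-0.4}} \approx 0.599,
\qquad
g_2 = \frac{e^{0.3}}{e^{0.7} + e^{0.3}} \approx 0.401 .
```

Neither value depends on x, which is why no linear partitioning appears in that region.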

Besides localisation beyond matching, the generalised MoE model has another feature that distinguishes it from any previous LCS³: it allows for matching by a degree in the range [0, 1] rather than by just specifying where a classifier matches and where it does not (as, for example, specified by the set X_k and (3.9)). Additionally, by (4.19), this degree has the well-defined meaning of the probability p(m_k = 1 | x) of classifier k matching input x. Alternatively, by observing that E(m_k | x) = p(m_k = 1 | x), this degree can also be interpreted as the

³ While Butz seems to have experimented with matching by a degree in [41], he does not describe how it is implemented and only states that “Preliminary experiments in that respect [...] did not yield any further improvement in performance”. Furthermore, his hyper-ellipsoidal conditions [41, 52] might look like matching by degree on initial inspection, but as he determines matching by a threshold on the basis function, matching is still binary. Fuzzy LCS (for example, [60]), on the other hand, provide matching by degree but are usually not developed from the bottom up, which makes modifying the parameter update equations difficult.
