Fig. 4.3. Directed graphical model of the generalised Mixtures-of-Experts model. See the caption of Fig. 4.1 for instructions on how to read this graph. Compared to the Mixtures-of-Experts model in Fig. 4.1, the latent variables $z_{nk}$ depend additionally on the matching random variables $m_{nk}$, whose values are determined by the matching functions $m_k$ and the inputs $x_n$.
that is, the value of a classifier's matching function determines the probability
of that classifier matching a certain input.
To enforce matching, the probability for classifier $k$ having generated observation $(x, y)$, given by (4.4), is redefined to be

$$
p(z_k = 1 \mid x, v_k, m_k) \propto
\begin{cases}
\exp\!\left(v_k^\top \phi(x)\right) & \text{if } m_k = 1 \text{ for } x, \\
0 & \text{otherwise},
\end{cases}
\qquad (4.20)
$$
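As a concrete illustration, the matched unnormalised membership probability of (4.20) can be sketched as below; the function name `unnormalised_gate` and the explicit matching predicate are illustrative assumptions, not notation from the text:

```python
import math

def unnormalised_gate(x, v_k, matches, phi=lambda x: x):
    # Unnormalised p(z_k = 1 | x, v_k, m_k) from eq. (4.20):
    # 0 if classifier k does not match x, exp(v_k^T phi(x)) otherwise.
    # phi defaults to the identity transfer function, as in the text.
    if not matches(x):
        return 0.0
    z = phi(x)
    return math.exp(sum(v_i * z_i for v_i, z_i in zip(v_k, z)))
```

With $v_k = 0$ the matched case reduces to $\exp(0) = 1$, while a non-matching classifier is assigned probability 0 regardless of $v_k$.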
where $\phi$ is a transfer function, whose purpose will be explained later and which can for now be assumed to be the identity function, $\phi(x) = x$. Thus, the differences from the previous definition (4.4) are the additional transfer function and the condition on $m_k$ that locks the generation probability to 0 if the classifier does not match the input. Removing the condition on $m_k$ by marginalising it out results in
$$
\begin{aligned}
g_k(x) \equiv p(z_k = 1 \mid x, v_k)
&\propto \sum_{m \in \{0,1\}} p(z_k = 1 \mid x, v_k, m_k = m)\, p(m_k = m \mid x) \\
&= 0 + p(z_k = 1 \mid x, v_k, m_k = 1)\, p(m_k = 1 \mid x) \\
&= m_k(x) \exp\!\left(v_k^\top \phi(x)\right).
\end{aligned}
\qquad (4.21)
$$
Adding the normalisation term, the gating network is now defined by
$$
g_k(x) \equiv p(z_k = 1 \mid x, v_k)
= \frac{m_k(x) \exp\!\left(v_k^\top \phi(x)\right)}
       {\sum_{j=1}^{K} m_j(x) \exp\!\left(v_j^\top \phi(x)\right)}.
\qquad (4.22)
$$
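Under the same notation, the full matched gating of (4.22) amounts to a softmax taken over the matching classifiers only. A minimal NumPy sketch, where the function name `gating` and the array layout are illustrative assumptions:

```python
import numpy as np

def gating(x, V, match_fns, phi=lambda x: x):
    # g_k(x) of eq. (4.22): matched softmax over the K classifiers.
    # V is a (K, D) array whose rows are the gating vectors v_k;
    # match_fns is a list of K matching functions m_k(x) in [0, 1];
    # phi defaults to the identity transfer function.
    logits = V @ phi(x)
    logits -= logits.max()  # for numerical stability; cancels in the ratio
    f = np.array([m(x) for m in match_fns]) * np.exp(logits)
    return f / f.sum()
```

A classifier with $m_k(x) = 0$ receives exactly zero gating weight, as noted after (4.22), while the weights of the matching classifiers still sum to 1.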
As can be seen when comparing it to (4.5), the additional layer of localisation is specified by the matching function, which reduces the gating to $g_k(x) = 0$ if the classifier does not match $x$, that is, if $m_k(x) = 0$.