change. Picks one classifier of $M_t$ at random, and re-initialises its matching function by taking a sample from $p(m_k)$.

add. Adds one classifier to $M_t$, with a matching function sampled from $p(m_k)$, resulting in $K_t + 1$ classifiers.

remove. Removes one classifier from $M_t$ at random, resulting in $K_t - 1$ classifiers.
The actions are chosen by taking samples from the discrete random variable $A \in \{\text{change}, \text{add}, \text{remove}\}$, where we assume $p(A = \text{add}) = p(A = \text{remove})$ and $p(A = \text{change}) = \frac{1}{2} p(A = \text{add})$.
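As a minimal sketch of how these actions could be sampled, the two stated constraints together with normalisation fix the action probabilities to $p(A=\text{add}) = p(A=\text{remove}) = 2/5$ and $p(A=\text{change}) = 1/5$ (the numeric values are derived here, not stated in the text):

```python
import random

# Hypothetical action probabilities satisfying the stated constraints:
#   p(add) = p(remove)  and  p(change) = (1/2) p(add).
# Normalisation gives p(add) + p(remove) + p(add)/2 = 1, so p(add) = 2/5.
P_ADD = P_REMOVE = 2 / 5
P_CHANGE = 1 / 5

def sample_action(rng=random):
    """Sample an action A from {change, add, remove}."""
    u = rng.random()
    if u < P_ADD:
        return "add"
    if u < P_ADD + P_REMOVE:
        return "remove"
    return "change"
```

The function names here are illustrative only; any sampler that respects the two constraints would do.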
Let us now consider how to compute the acceptance probability (8.4) for each of these actions. We have $p(M|D) \propto p(D|M)\,p(M|K)\,p(K)$ by Bayes' Theorem, where, different to (7.3), we have separated the number of classifiers $K$ from the model structure $M$. As in (7.4), a uniform prior over unique models is assumed, resulting in $p(K) \propto 1/K!$. Additionally, every classifier in $M$ is created independently by sampling from $p(m_k)$, which results in $p(M|K) = p(m_k)^K$. Using variational inference, the model evidence is approximated by the variational bound $p(D|M) \propto \exp(\mathcal{L}_M(q))$, where $\mathcal{L}_M(q)$ denotes the variational bound of model $M$. Thus, in combination we have
$$\frac{p(M|D)}{p(M_t|D)} \approx \frac{\exp(\mathcal{L}_M(q))\, p(m_k)^K \,(K!)^{-1}}{\exp(\mathcal{L}_{M_t}(q))\, p(m_k)^{K_t}\,(K_t!)^{-1}}, \qquad (8.5)$$
where $K$ denotes the number of classifiers in $M$.
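The log of the ratio of posteriors in (8.5) can be evaluated numerically; a minimal sketch, assuming a shared value of $\ln p(m_k)$ for all classifiers as in the text, and using `lgamma` to evaluate $\ln K!$ stably:

```python
from math import lgamma

def log_posterior_ratio(L_M, L_Mt, K, K_t, log_p_mk):
    """Log of (8.5): ln p(M|D) - ln p(M_t|D), with the shared
    normalisation constant cancelling.  L_M and L_Mt are the
    variational bounds of the two models, log_p_mk is ln p(m_k).
    Note ln K! = lgamma(K + 1)."""
    return (L_M - L_Mt
            + (K - K_t) * log_p_mk
            - (lgamma(K + 1) - lgamma(K_t + 1)))
```

Working in log space avoids overflow in $\exp(\mathcal{L}_M(q))$ and $K!$ for realistically sized bounds and classifier counts.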
We get the model transition probability $p(M|M_t)$ by marginalising over the actions $A$, to get
$$p(M|M_t) = p(M|M_t, A=\text{change})\,p(A=\text{change}) + p(M|M_t, A=\text{add})\,p(A=\text{add}) + p(M|M_t, A=\text{remove})\,p(A=\text{remove}), \qquad (8.6)$$
and a similar expression for $p(M_t|M)$. When choosing action add, then $K = K_t + 1$, and $p(M|M_t, A=\text{change}) = p(M|M_t, A=\text{remove}) = 0$, as neither the action change nor the action remove cause a classifier to be added. $M_t$ and $M$ differ in a single classifier that is picked from $p(m_k)$, and therefore $p(M|M_t, A=\text{add}) = p(m_k)$. Similarly, when choosing the action remove for $M_t$, an arbitrary classifier is picked with probability $1/K_t$, and therefore $p(M|M_t, A=\text{remove}) = 1/K_t$. The action change requires choosing a classifier with probability $1/K_t$ and reinitialising it with probability $p(m_k)$, giving $p(M|M_t, A=\text{change}) = p(m_k)/K_t$. The reverse transitions $p(M_t|M)$ can be evaluated by observing that the only possible action that causes the reverse transition from $M$ to $M_t$ after the action add is the action remove, and vice versa. Equally, change causes the reverse transition after performing action change.
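The three forward transition densities derived above can be summarised in a small helper; a sketch, where `p_mk` stands for $p(m_k)$ evaluated at the sampled matching function (the function name is illustrative):

```python
def transition_prob(action, K_t, p_mk):
    """Forward transition density p(M|M_t, A) for a candidate M
    produced from M_t (with K_t classifiers) by the given action."""
    if action == "add":
        return p_mk           # new classifier sampled from p(m_k)
    if action == "remove":
        return 1 / K_t        # uniform choice among K_t classifiers
    if action == "change":
        return p_mk / K_t     # choose classifier, then re-sample from p(m_k)
    raise ValueError(f"unknown action: {action}")
```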
Overall, the candidate model $M$ that was created by add from $M_t$ is accepted by (8.4) with probability
$$\min\left( \frac{p(M|D)\, p(M_t|M, A=\text{remove})\, p(A=\text{remove})}{p(M_t|D)\, p(M|M_t, A=\text{add})\, p(A=\text{add})},\ 1 \right) \approx \min\left( \exp\left( \mathcal{L}_M(q) - \mathcal{L}_{M_t}(q) - 2\ln(K_t+1) \right),\ 1 \right), \qquad (8.7)$$
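The right-hand side of (8.7) is cheap to evaluate, since the $p(m_k)$ and $p(A)$ terms cancel; a minimal sketch that computes it in log space to avoid overflow when the bound difference is large:

```python
from math import exp, log

def accept_add(L_M, L_Mt, K_t):
    """Acceptance probability (8.7) for a candidate created by add:
    min(exp(L_M - L_Mt - 2 ln(K_t + 1)), 1), with K_t the number of
    classifiers in the current model M_t."""
    log_alpha = L_M - L_Mt - 2 * log(K_t + 1)
    return 1.0 if log_alpha >= 0 else exp(log_alpha)
```

For equal variational bounds the candidate is still accepted with probability $(K_t+1)^{-2}$, so the $-2\ln(K_t+1)$ term acts as a complexity penalty on growing the classifier set.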