Function ModelProbability(M, X, Y, Φ)
Input: matching matrix M, input matrix X, output matrix Y, mixing feature matrix Φ
Output: approximate model probability L(q) + ln p(M)

 1: get K from shape of M
 2: for k ← 1 to K do
 3:     m_k ← kth column of M
 4:     W_k, Λ_k^{-1}, a_τk, b_τk, a_αk, b_αk ← TrainClassifier(m_k, X, Y)
 5: W, Λ^{-1} ← {W_1, ..., W_K}, {Λ_1^{-1}, ..., Λ_K^{-1}}
 6: a_τ, b_τ ← {a_τ1, ..., a_τK}, {b_τ1, ..., b_τK}
 7: a_α, b_α ← {a_α1, ..., a_αK}, {b_α1, ..., b_αK}
 8: V, Λ_V^{-1}, a_β, b_β ← TrainMixing(M, X, Y, Φ, W, Λ^{-1}, a_τ, b_τ, a_α, b_α)
 9: θ ← {W, Λ^{-1}, a_τ, b_τ, a_α, b_α, V, Λ_V^{-1}, a_β, b_β}
10: L(q) ← VarBound(M, X, Y, Φ, θ)
11: return L(q) + ln K!
The function computes an approximation to the unnormalised log-posterior ln p(M|D). Thus, it replaces the model evidence p(D|M) in (7.3) by its approximation L(q). The function assumes that the order of the classifiers can be arbitrarily permuted without changing the model structure and therefore uses the p(M) given by (7.4). In approximating ln p(M|D), the function does not add the normalisation constant. Hence, even though the return values are not proper probabilities, they can still be used for the comparison of different model structures, as the normalisation term is shared between all of them.
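To see why, note that by Bayes' rule ln p(M|D) = ln p(D|M) + ln p(M) − ln p(D), where ln p(D) is the same for every model structure. Comparing two structures M_1 and M_2 by the difference of the returned values, with L(q; M_i) denoting the bound obtained for structure M_i,

    [L(q; M_1) + ln p(M_1)] − [L(q; M_2) + ln p(M_2)] ≈ ln p(M_1|D) − ln p(M_2|D),

thus recovers the correct log-odds even though the constant −ln p(D) is never added.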
The computation of L(q) + ln p(M) is straightforward: Lines 2 to 7 compute
and assemble the parameters of the classifiers by calling TrainClassifier for
each classifier k separately, and provide it with the data and the matching vector
m_k for that classifier. After that, the mixing model parameters are computed in Line 8 by calling TrainMixing, based on the fully trained classifiers.
Having evaluated all classifiers, the function collects all parameters in Line 9 to give θ, and uses them in Line 10 to compute L(q) by calling VarBound. After that, it returns L(q) + ln K!, based on (7.3) and (7.4).
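Expressed in code, the procedure might look as follows. This is only a minimal sketch of the above pseudocode, assuming NumPy arrays for the matrices; the helpers train_classifier, train_mixing and var_bound stand in for TrainClassifier, TrainMixing and VarBound, and the parameter bookkeeping of Lines 5 to 9 is folded into simple containers.

import math

def model_probability(M, X, Y, Phi):
    """Return the approximate model probability L(q) + ln K!.

    M   -- (N, K) matching matrix, one column per classifier
    X   -- (N, D_X) input matrix
    Y   -- (N, D_Y) output matrix
    Phi -- (N, D_V) mixing feature matrix
    """
    K = M.shape[1]                                     # Line 1: K from shape of M
    classifiers = []
    for k in range(K):                                 # Lines 2-4
        m_k = M[:, k]                                  # matching vector of classifier k
        classifiers.append(train_classifier(m_k, X, Y))
    # Lines 5-7: the per-classifier parameters stay grouped in `classifiers`
    mixing = train_mixing(M, X, Y, Phi, classifiers)   # Line 8
    theta = (classifiers, mixing)                      # Line 9: collect all parameters
    L_q = var_bound(M, X, Y, Phi, theta)               # Line 10: variational bound L(q)
    return L_q + math.lgamma(K + 1)                    # Line 11: ln K! = ln Γ(K + 1)

Computing ln K! via math.lgamma(K + 1) avoids evaluating the factorial itself, which would overflow for large K.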
8.1.2 Training the Classifiers
The function TrainClassifier takes the data X, Y and the matching vector m_k, and returns all model parameters for the trained classifier k. The model parameters are found by iteratively updating the distribution parameters of the variational posteriors q_{W,τ}(W_k, τ_k) and q_α(α_k) until the convergence criterion is satisfied. This criterion is given by the classifier-specific components L_k(q) of the variational bound L(q), as given by (7.91). However, rather than evaluating L_k(q) with the responsibilities r_nk, as done in (7.91), the matching function values m_k(x_n) are used instead. The underlying idea is that, as each classifier is trained independently, the responsibilities are equivalent to the matching function values. This has the effect that by updating the classifier parameters according
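A minimal Python sketch of such a training loop, under stated assumptions, is given below; init_classifier_params, update_classifier_posteriors and classifier_bound are hypothetical helpers for initialising the posterior parameters, performing the variational updates, and evaluating L_k(q), and the convergence threshold tol is an arbitrary choice.

def train_classifier(m_k, X, Y, max_iter=1000, tol=1e-4):
    """Iteratively refine the variational posteriors of classifier k.

    The parameters of q_{W,tau}(W_k, tau_k) and q_alpha(alpha_k) are
    updated in turn until the classifier-specific bound L_k(q) increases
    by less than tol. As described above, the matching values m_k(x_n)
    take the place of the responsibilities r_nk.
    """
    params = init_classifier_params(X, Y)          # hypothetical initialiser
    L_prev = float('-inf')
    for _ in range(max_iter):
        # alternate the updates of q_{W,tau}(W_k, tau_k) and q_alpha(alpha_k)
        params = update_classifier_posteriors(params, m_k, X, Y)
        L_k = classifier_bound(params, m_k, X, Y)  # L_k(q), cf. (7.91)
        if L_k - L_prev < tol:                     # convergence criterion met
            break
        L_prev = L_k
    return params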