x_nᵀ Λ_k⁻¹ x_n = Σ_i (x_n)_i (Λ_k⁻¹ x_n)_i, based on x_nᵀ Λ_k⁻¹ and the rows of X. The values of g_k(x_n) are added to ρ_nk in Line 6, and the normalisation step by (7.63) is performed in Line 7. For the same reason as in the Mixing function, all NaN values in R need to be subsequently replaced by 0 to not assign responsibility to any classifiers for inputs that are not matched.
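To make the normalisation and NaN replacement concrete, here is a minimal numpy sketch of that step; the names (normalise_responsibilities, rho) are illustrative and not taken from the original listing, and rho is assumed to hold the unnormalised responsibilities ρ_nk as an N × K array.

    import numpy as np

    def normalise_responsibilities(rho):
        # Row-wise normalisation as in (7.63): r_nk = rho_nk / sum_k rho_nk.
        # Rows of inputs that no classifier matches give 0/0 = NaN,
        # which is replaced by 0 so that they receive no responsibility.
        with np.errstate(invalid='ignore', divide='ignore'):
            R = rho / rho.sum(axis=1, keepdims=True)
        return np.nan_to_num(R, nan=0.0)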
Function TrainMixWeights(M, X, Y, Φ, W, Λ⁻¹, a_τ, b_τ, V, a_β, b_β)
Input: matching matrix M, input matrix X, output matrix Y, mixing feature matrix Φ, classifier parameters W, Λ⁻¹, a_τ, b_τ, mixing weight matrix V, mixing weight prior parameters a_β, b_β
Output: D_V × K mixing weight matrix V, (K D_V) × (K D_V) mixing weight covariance matrix Λ_V⁻¹
 1  E_β(β) ← row vector with elements a_β1/b_β1, ..., a_βK/b_βK
 2  G ← Mixing(M, Φ, V)
 3  R ← Responsibilities(X, Y, G, W, Λ⁻¹, a_τ, b_τ)
 4  KL(R‖G) ← ∞
 5  ΔKL(R‖G) ← Δ_s KL(R‖G) + 1
 6  while ΔKL(R‖G) > Δ_s KL(R‖G) do
 7      E ← Φᵀ(G − R) + V E_β(β)
 8      e ← (E_11, ..., E_{D_V 1}, E_12, ..., E_{D_V 2}, ..., E_1K, ..., E_{D_V K})ᵀ
 9      H ← Hessian(Φ, G, a_β, b_β)
10      Δv ← −H⁻¹ e
11      ΔV ← D_V × K matrix with jk-th element
12           given by the ((k − 1) D_V + j)-th element of Δv
13      V ← V + ΔV
14      G ← Mixing(M, Φ, V)
15      R ← Responsibilities(X, Y, G, W, Λ⁻¹, a_τ, b_τ)
16      KL_prev(R‖G) ← KL(R‖G)
17      KL(R‖G) ← −Sum(R ⊗ FixNaN(ln(G ⊘ R), 0))
18      ΔKL(R‖G) ← |KL_prev(R‖G) − KL(R‖G)|
19  H ← Hessian(Φ, G, a_β, b_β)
20  Λ_V⁻¹ ← H⁻¹
21  return V, Λ_V⁻¹
The Function TrainMixWeights approximates the variational posterior q_V(V) (7.51) of the mixing weights by performing the IRLS algorithm. It takes the matching matrix, the data and mixing feature matrices, the trained classifier parameters, the mixing weight matrix, and the mixing weight prior parameters. As the IRLS algorithm performs incremental updates of the mixing weights V until convergence, V is not re-initialised every time TrainMixWeights is called; rather, the previous estimates are used as initial values to reduce the number of iterations required until convergence.
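To illustrate Lines 7-13 of the listing above, the following numpy sketch performs a single IRLS update of the mixing weights. It assumes Phi is the N × D_V mixing feature matrix, G and R the N × K mixing coefficient and responsibility matrices, V the D_V × K mixing weight matrix, E_beta a length-K vector of the expectations a_βk/b_βk, and H the (K D_V) × (K D_V) Hessian returned by the Hessian function; the variable and function names are illustrative only.

    import numpy as np

    def irls_step(Phi, G, R, V, E_beta, H):
        # Gradient E (Line 7): column k of V is scaled by E_beta[k].
        E = Phi.T @ (G - R) + V * E_beta
        # Stack the columns of E into a single vector e (Line 8).
        e = E.flatten(order='F')
        # Newton step: Delta v = -H^{-1} e (Line 10).
        dv = -np.linalg.solve(H, e)
        # Un-stack into a D_V x K matrix Delta V (Lines 11-12) and update V (Line 13).
        dV = dv.reshape(V.shape, order='F')
        return V + dV

Using column-major (Fortran) order for flattening and reshaping reproduces the element ordering of e and ΔV given in Lines 8 and 11-12.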
As the aim is to model the responsibilities by finding mixing weights that make the mixing coefficients given by g_k(x_n) similar to r_nk, convergence is determined by monitoring the change in the Kullback-Leibler divergence KL(R‖G) between the responsibilities and the mixing coefficients.
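A matching sketch of the convergence monitor in Lines 17-18 computes the NaN-guarded Kullback-Leibler divergence; again, the names are illustrative.

    import numpy as np

    def kl_R_G(R, G):
        # KL(R||G) = -sum_{n,k} r_nk ln(g_nk / r_nk), with NaN terms
        # (for example 0 * ln(0/0)) treated as 0, mirroring FixNaN(., 0).
        with np.errstate(divide='ignore', invalid='ignore'):
            terms = R * np.log(G / R)
        return -np.nansum(terms)

The loop in the listing terminates once the absolute change |KL_prev(R‖G) − KL(R‖G)| no longer exceeds the threshold Δ_s KL(R‖G).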
 