The Optimal Set of Classifiers - Design and Analysis of Learning Classifier Systems

Thus, evaluating (7.26) gives

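$$
\begin{aligned}
\ln q^*_{W,\tau}(W_k, \tau_k)
&= \left( D_Y a_\tau - D_Y + \frac{D_X D_Y}{2} + \frac{D_Y}{2} \sum_n r_{nk} \right) \ln \tau_k \\
&\quad - \frac{\tau_k}{2} \left( 2 D_Y b_\tau + \sum_j \left( \sum_n r_{nk} y_{nj}^2 - 2 w_{kj}^T \sum_n r_{nk} x_n y_{nj} + w_{kj}^T \Big( E_\alpha(\alpha_k) I + \sum_n r_{nk} x_n x_n^T \Big) w_{kj} \right) \right) + \text{const.} \\
&= \ln \prod_j \left( \mathcal{N}\big( w_{kj} \mid w^*_{kj}, (\tau_k \Lambda^*_k)^{-1} \big) \, \mathrm{Gam}\big( \tau_k \mid a^*_{\tau_k}, b^*_{\tau_k} \big) \right),
\end{aligned}
\tag{7.29}
$$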

with the distribution parameters

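$$
\Lambda^*_k = E_\alpha(\alpha_k) I + \sum_n r_{nk} x_n x_n^T, \tag{7.30}
$$

$$
w^*_{kj} = {\Lambda^*_k}^{-1} \sum_n r_{nk} x_n y_{nj}, \tag{7.31}
$$

$$
a^*_{\tau_k} = a_\tau + \frac{1}{2} \sum_n r_{nk}, \tag{7.32}
$$

$$
b^*_{\tau_k} = b_\tau + \frac{1}{2 D_Y} \sum_j \left( \sum_n r_{nk} y_{nj}^2 - {w^*_{kj}}^T \Lambda^*_k w^*_{kj} \right). \tag{7.33}
$$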

The second equality in (7.29) can be derived by expanding the final result and replacing all terms that are independent of $W_k$ and $\tau_k$ by a constant. The distribution parameter update equations are those of a standard Bayesian weighted linear regression (for example, [19, 15, 72]).
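The update equations (7.30) to (7.33) translate almost line by line into matrix operations. The following is a minimal NumPy sketch under stated assumptions, not the author's implementation: the function name `update_classifier_posterior` and its argument names are hypothetical, the inputs $x_n$ are taken to be the rows of `X`, the outputs $y_n$ the rows of `Y`, and the responsibilities $r_{nk}$ of a single classifier $k$ are passed as the vector `r`.

```python
import numpy as np

def update_classifier_posterior(X, Y, r, E_alpha, a_tau, b_tau):
    """Sketch of (7.30)-(7.33) for one classifier k (names are illustrative).

    X: (N, D_X) inputs, Y: (N, D_Y) outputs, r: (N,) responsibilities r_nk,
    E_alpha: scalar E_alpha(alpha_k), a_tau/b_tau: prior Gamma parameters.
    """
    D_X = X.shape[1]
    D_Y = Y.shape[1]
    Xr = X * r[:, None]                            # each row x_n scaled by r_nk
    Lambda = E_alpha * np.eye(D_X) + Xr.T @ X      # (7.30)
    W = np.linalg.solve(Lambda, Xr.T @ Y)          # (7.31); column j holds w*_kj
    a = a_tau + 0.5 * r.sum()                      # (7.32)
    weighted_sq = np.sum(r[:, None] * Y ** 2)      # sum_j sum_n r_nk y_nj^2
    quad = np.sum(W * (Lambda @ W))                # sum_j w*_kj^T Lambda w*_kj
    b = b_tau + (weighted_sq - quad) / (2.0 * D_Y) # (7.33)
    return Lambda, W, a, b

# Usage on synthetic data (values chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
Y = rng.normal(size=(20, 2))
r = rng.uniform(size=20)
Lambda, W, a, b = update_classifier_posterior(X, Y, r, 0.5, 1e-2, 1e-4)
```

Solving the linear system with `np.linalg.solve` avoids forming ${\Lambda^*_k}^{-1}$ explicitly, which is usually the numerically safer choice when only the weight means are needed.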

Note that due to the use of conjugate priors, the variational posterior $q^*_{W,\tau}(W_k, \tau_k)$ (7.29) has the same distribution form as the prior $p(W_k, \tau_k \mid \alpha_k)$ (7.8).

The resulting weight vector $w_{kj}$, that models the relation between the inputs and the $j$th component of the outputs, is given by a Gaussian with mean $w^*_{kj}$ and precision $\tau_k \Lambda^*_k$. The same posterior weight mean can be found by minimising

$$
\| X w_{kj} - y_j \|^2_{R_k} + E_\alpha(\alpha_k) \| w_{kj} \|^2, \tag{7.34}
$$

with respect to $w_{kj}$, where $R_k$ is the diagonal matrix $R_k = \mathrm{diag}(r_{1k}, \dots, r_{Nk})$, and $y_j$ is the vector of $j$th output elements, $y_j = (y_{1j}, \dots, y_{Nj})^T$, that is, the $j$th column of $Y$. This shows that we are performing a responsibility-weighted ridge regression with ridge complexity $E_\alpha(\alpha_k)$. Thus, the shrinkage is determined by the prior on $\alpha_k$, as can be expected from the specification of the weight vector prior (7.8).
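The equivalence between the posterior mean (7.31) and the minimiser of (7.34) can be checked numerically. The sketch below is illustrative only: it assumes the weighted norm means $\|z\|^2_{R_k} = z^T R_k z$, uses arbitrary synthetic data, and introduces the hypothetical helper `ridge_objective`. It compares the closed-form mean with an ordinary least-squares solve on an augmented system that encodes the same objective.

```python
import numpy as np

def ridge_objective(w, X, y, r, E_alpha):
    """Value of (7.34), reading ||z||^2_R as z^T R z with R = diag(r)."""
    resid = X @ w - y
    return resid @ (r * resid) + E_alpha * (w @ w)

# Synthetic single-output problem (all values arbitrary, for illustration).
rng = np.random.default_rng(1)
N, D_X = 30, 4
X = rng.normal(size=(N, D_X))
y = rng.normal(size=N)
r = rng.uniform(size=N)          # responsibilities r_nk
E_alpha = 0.7                    # stands in for E_alpha(alpha_k)

# Closed-form posterior mean, as in (7.30) and (7.31).
Lambda = E_alpha * np.eye(D_X) + X.T @ (r[:, None] * X)
w_star = np.linalg.solve(Lambda, X.T @ (r * y))

# Same objective as plain least squares on an augmented system:
# rows sqrt(r_nk) x_n^T with targets sqrt(r_nk) y_nj, plus sqrt(E_alpha) I
# with zero targets, which contributes the ridge term E_alpha ||w||^2.
A = np.vstack([np.sqrt(r)[:, None] * X, np.sqrt(E_alpha) * np.eye(D_X)])
t = np.concatenate([np.sqrt(r) * y, np.zeros(D_X)])
w_ls, *_ = np.linalg.lstsq(A, t, rcond=None)

# The gradient of (7.34) is 2 (Lambda w - X^T R y); it vanishes at w_star.
grad_at_w_star = 2.0 * (Lambda @ w_star - X.T @ (r * y))
```

Both routes should agree up to floating-point error, and a larger `E_alpha` pulls the solution further towards zero, which is the shrinkage effect described above.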

The noise precision posterior is the Gamma distribution $\mathrm{Gam}(\tau_k \mid a^*_{\tau_k}, b^*_{\tau_k})$. Using the relation $\nu\lambda / \chi^2_\nu \sim \mathrm{Gam}(\nu/2, \nu\lambda/2)$, where $\nu\lambda / \chi^2_\nu$ is the scaled inverse $\chi^2$
