7.3.8 The Variational Bound $\mathcal{L}(q)$
We are most interested in finding the value for $\mathcal{L}(q)$ by (7.21), as it provides us with an approximated lower bound on the logarithm of the model evidence $\ln p(Y)$, and is the actual expression that is to be maximised. Evaluating (7.21) by using the distribution decomposition according to (7.15), the variational bound is given by
$$
\begin{aligned}
\mathcal{L}(q) &= \int q(U) \ln \frac{p(Y, U)}{q(U)} \,\mathrm{d}U \\
&= \mathbb{E}_{W,\tau,\alpha,Z,V,\beta}\big(\ln p(Y, W, \tau, \alpha, Z, V, \beta)\big)
 - \mathbb{E}_{W,\tau,\alpha,Z,V,\beta}\big(\ln q(W, \tau, \alpha, Z, V, \beta)\big) \\
&= \mathbb{E}_{W,\tau,Z}\big(\ln p(Y \,|\, W, \tau, Z)\big)
 + \mathbb{E}_{W,\tau,\alpha}\big(\ln p(W, \tau \,|\, \alpha)\big)
 + \mathbb{E}_{\alpha}\big(\ln p(\alpha)\big) \\
&\quad + \mathbb{E}_{Z,V}\big(\ln p(Z \,|\, V)\big)
 + \mathbb{E}_{V,\beta}\big(\ln p(V \,|\, \beta)\big)
 + \mathbb{E}_{\beta}\big(\ln p(\beta)\big) \\
&\quad - \mathbb{E}_{W,\tau}\big(\ln q(W, \tau)\big)
 - \mathbb{E}_{\alpha}\big(\ln q(\alpha)\big)
 - \mathbb{E}_{Z}\big(\ln q(Z)\big)
 - \mathbb{E}_{V}\big(\ln q(V)\big)
 - \mathbb{E}_{\beta}\big(\ln q(\beta)\big),
\end{aligned}
\tag{7.75}
$$
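Equation (7.75) is the generic bound $\mathcal{L}(q) = \mathbb{E}_q(\ln p(Y,U)) - \mathbb{E}_q(\ln q(U))$ expanded over the factors of the model. As a minimal numerical sketch of why it lower-bounds $\ln p(Y)$, consider a hypothetical three-state discrete latent variable (illustrative numbers only, not the book's model); the bound never exceeds the log-evidence, and becomes tight when $q$ equals the true posterior:

```python
import numpy as np

# Hypothetical joint p(Y, u) over three latent states u, for a fixed observed Y.
p_joint = np.array([0.10, 0.25, 0.15])
log_evidence = np.log(p_joint.sum())            # ln p(Y) = ln sum_u p(Y, u)

# Any normalised variational distribution q(u) gives L(q) <= ln p(Y).
q = np.array([0.3, 0.4, 0.3])
L = np.sum(q * (np.log(p_joint) - np.log(q)))   # discretised form of (7.21)/(7.75)
assert L < log_evidence

# The bound is tight when q is the true posterior p(u|Y) = p(Y,u)/p(Y).
q_post = p_joint / p_joint.sum()
L_tight = np.sum(q_post * (np.log(p_joint) - np.log(q_post)))
assert abs(L_tight - log_evidence) < 1e-12
```

The gap $\ln p(Y) - \mathcal{L}(q)$ is exactly the KL divergence between $q$ and the true posterior, which is why maximising $\mathcal{L}(q)$ drives $q$ towards that posterior.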
where all expectations are taken with respect to the variational distribution $q$. These are evaluated one by one, using the previously derived moments of the variational posteriors.
To derive $\mathbb{E}_{W,\tau,Z}(\ln p(Y \,|\, W, \tau, Z))$, we use (7.6) and (7.7) to get
$$
\begin{aligned}
\mathbb{E}_{W,\tau,Z}\big(\ln p(Y \,|\, W, \tau, Z)\big)
&= \sum_n \sum_k \mathbb{E}_Z(z_{nk}) \sum_j \mathbb{E}_{W,\tau}\Big(\ln \mathcal{N}\big(y_{nj} \,\big|\, w_{kj}^{\mathsf T} x_n, \tau_k^{-1}\big)\Big) \\
&= \sum_n \sum_k r_{nk} \sum_j \left( -\frac{1}{2} \ln 2\pi + \frac{1}{2} \mathbb{E}_\tau(\ln \tau_k)
 - \frac{1}{2} \mathbb{E}_{W,\tau}\Big(\tau_k \big(y_{nj} - w_{kj}^{\mathsf T} x_n\big)^2\Big) \right) \\
&= \sum_k \Bigg( \frac{D_Y}{2} \sum_n r_{nk} \big( \psi(a_{\tau_k}) - \ln b_{\tau_k} - \ln 2\pi \big)
 - \frac{1}{2} \sum_n r_{nk} \sum_j \left( \frac{a_{\tau_k}}{b_{\tau_k}} \big(y_{nj} - w_{kj}^{\mathsf T} x_n\big)^2 + x_n^{\mathsf T} \Lambda_k^{-1} x_n \right) \Bigg) \\
&= \sum_k \Bigg( \frac{D_Y}{2} \sum_n r_{nk} \big( \psi(a_{\tau_k}) - \ln b_{\tau_k} - \ln 2\pi \big)
 - \frac{1}{2} \sum_n r_{nk} \left( \frac{a_{\tau_k}}{b_{\tau_k}} \big\| y_n - W_k x_n \big\|^2 + D_Y\, x_n^{\mathsf T} \Lambda_k^{-1} x_n \right) \Bigg).
\end{aligned}
\tag{7.76}
$$
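The collapse from the per-element sum over $j$ in (7.76) to the final line (pulling $D_Y$ out of the constant and trace terms, and gathering the squared residuals into $\|y_n - W_k x_n\|^2$) can be checked numerically. The following is a minimal sketch with hypothetical variational parameters (random $r_{nk}$, $W_k$, $a_{\tau_k}$, $b_{\tau_k}$, $\Lambda_k^{-1}$, and arbitrary sizes), not the book's implementation:

```python
import numpy as np
from math import lgamma

def digamma(x, h=1e-5):
    # numerical psi(x) as a central difference of ln Gamma(x)
    return (lgamma(x + h) - lgamma(x - h)) / (2 * h)

rng = np.random.default_rng(0)
N, K, D_X, D_Y = 5, 3, 4, 2                       # hypothetical problem sizes

X = rng.normal(size=(N, D_X))                     # inputs x_n
Y = rng.normal(size=(N, D_Y))                     # outputs y_n
R = rng.dirichlet(np.ones(K), N)                  # responsibilities r_nk = E_Z(z_nk)
W = rng.normal(size=(K, D_Y, D_X))                # posterior weight means, rows w_kj
a_tau = rng.uniform(1.0, 3.0, K)                  # Gamma parameters a_tau_k
b_tau = rng.uniform(1.0, 3.0, K)                  # Gamma parameters b_tau_k
A = rng.normal(size=(K, D_X, D_X))
Lam_inv = np.stack([np.linalg.inv(a @ a.T + np.eye(D_X)) for a in A])  # Lambda_k^{-1}

# Final (collapsed) line of (7.76)
total = 0.0
for k in range(K):
    E_ln_tau = digamma(a_tau[k]) - np.log(b_tau[k])       # E_tau(ln tau_k)
    resid = Y - X @ W[k].T                                # rows y_n - W_k x_n
    quad = np.einsum('ni,ij,nj->n', X, Lam_inv[k], X)     # x_n^T Lambda_k^{-1} x_n
    total += (D_Y / 2) * np.sum(R[:, k] * (E_ln_tau - np.log(2 * np.pi))) \
        - 0.5 * np.sum(R[:, k] * (a_tau[k] / b_tau[k] * np.sum(resid**2, axis=1)
                                  + D_Y * quad))

# Element-wise middle line of (7.76): sum over n, k, j
check = sum(
    R[n, k] * (-0.5 * np.log(2 * np.pi)
               + 0.5 * (digamma(a_tau[k]) - np.log(b_tau[k]))
               - 0.5 * (a_tau[k] / b_tau[k] * (Y[n, j] - W[k, j] @ X[n])**2
                        + X[n] @ Lam_inv[k] @ X[n]))
    for n in range(N) for k in range(K) for j in range(D_Y))

assert np.isclose(total, check)
```

The two forms agree because the $-\frac{1}{2}\ln 2\pi$, $\frac{1}{2}\mathbb{E}_\tau(\ln\tau_k)$, and $x_n^{\mathsf T}\Lambda_k^{-1}x_n$ terms are constant in $j$, so summing over the $D_Y$ output dimensions simply multiplies them by $D_Y$.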
The classifier model parameters expectation $\mathbb{E}_{W,\tau,\alpha}(\ln p(W, \tau \,|\, \alpha))$ can be derived by using (7.7) and (7.16), and is given by
$$
\mathbb{E}_{W,\tau,\alpha}\big(\ln p(W, \tau \,|\, \alpha)\big)
= \sum_k \left( \sum_j \mathbb{E}_{W,\tau,\alpha}\Big(\ln \mathcal{N}\big(w_{kj} \,\big|\, \mathbf{0}, (\alpha_k \tau_k)^{-1} \mathbf{I}\big)\Big)
 + \mathbb{E}_\tau\big(\ln \operatorname{Gam}(\tau_k \,|\, a_\tau, b_\tau)\big) \right).
\tag{7.77}
$$