Note that $r_{nk}$ gives the responsibility that is assigned to classifier $k$ for modelling observation $n$, and is proportional to $\rho_{nk}$ (7.62). Thus, the responsibilities are on one hand proportional to the current mixing weights $g_k(\mathbf{x})$, and on the other hand are higher for low-variance classifiers (note that $\tau_k$ is the inverse variance of classifier $k$) that feature a low expected squared prediction error $(y_{nj} - \mathbf{w}_{kj}^T \mathbf{x}_n)^2$ for the associated observation. Overall, the responsibilities are distributed such that the observations are modelled by the classifiers that are best at modelling them.
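This qualitative behaviour can be sketched numerically. The following is a minimal illustration, not the exact update from the text: it uses plain Gaussian likelihoods in place of the expectations entering $\rho_{nk}$ in (7.62), and all function and variable names are illustrative.

```python
import numpy as np

def responsibilities(g, y, X, W, tau):
    """Normalised responsibilities r[n, k] for N observations and K
    classifiers (illustrative sketch, not the exact variational update).

    g   : (N, K) mixing weights g_k(x_n)
    y   : (N,)   scalar targets y_n
    X   : (N, D) inputs x_n
    W   : (K, D) classifier weight vectors w_k
    tau : (K,)   noise precisions (inverse variances) tau_k
    """
    # squared prediction errors (y_n - w_k^T x_n)^2, shape (N, K)
    err = (y[:, None] - X @ W.T) ** 2
    # unnormalised rho_nk: mixing weight times Gaussian likelihood;
    # high precision tau_k combined with low error gives large rho_nk
    rho = g * np.sqrt(tau / (2 * np.pi)) * np.exp(-0.5 * tau * err)
    # normalise over classifiers so responsibilities sum to 1 per observation
    return rho / rho.sum(axis=1, keepdims=True)
```

With equal mixing weights, the classifier whose prediction is closer to $y_n$ receives the larger share of the responsibility for observation $n$.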
7.3.7 Required Moments of the Variational Posterior
Some of the variational distribution parameters require evaluation of the moments of one or the other random variable in our probabilistic model. In this section, these moments and the ones required at a later stage are evaluated. Throughout this section we use $\mathbb{E}_x(\mathbf{x}) = \bar{\mathbf{x}}$ and $\mathrm{cov}_x(\mathbf{x}, \mathbf{x}) = \mathbf{\Lambda}^{-1}$, where $\mathbf{x} \sim \mathcal{N}(\bar{\mathbf{x}}, \mathbf{\Lambda}^{-1})$ is a random vector that is distributed according to a multivariate Gaussian with mean $\bar{\mathbf{x}}$ and covariance matrix $\mathbf{\Lambda}^{-1}$.

Given a random variable $X \sim \mathrm{Gam}(a, b)$, its expectation is $\mathbb{E}_X(X) = a/b$, and the expectation of its logarithm is $\mathbb{E}_X(\ln X) = \psi(a) - \ln b$, where $\psi(x) = \frac{d}{dx} \ln \Gamma(x)$ is the digamma function [19]. Thus the following are the posterior moments for $q_\alpha(\alpha_k)$, $q_\beta(\beta_k)$, and $q_\tau(\tau_k)$:
$$\mathbb{E}_\alpha(\alpha_k) = \frac{a_{\alpha_k}}{b_{\alpha_k}}, \qquad (7.64)$$
$$\mathbb{E}_\alpha(\ln \alpha_k) = \psi(a_{\alpha_k}) - \ln b_{\alpha_k}, \qquad (7.65)$$
$$\mathbb{E}_\beta(\beta_k) = \frac{a_{\beta_k}}{b_{\beta_k}}, \qquad (7.66)$$
$$\mathbb{E}_\beta(\ln \beta_k) = \psi(a_{\beta_k}) - \ln b_{\beta_k}, \qquad (7.67)$$
$$\mathbb{E}_\tau(\tau_k) = \frac{a_{\tau_k}}{b_{\tau_k}}, \qquad (7.68)$$
$$\mathbb{E}_\tau(\ln \tau_k) = \psi(a_{\tau_k}) - \ln b_{\tau_k}. \qquad (7.69)$$
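The Gamma moments $\mathbb{E}(X) = a/b$ and $\mathbb{E}(\ln X) = \psi(a) - \ln b$ underlying (7.64)-(7.69) can be checked by Monte Carlo. This is a standalone sketch (the helper names are illustrative); note that $b$ here is the *rate*, so the scale passed to the sampler is $1/b$, and the digamma function is approximated by a central difference on `math.lgamma`.

```python
import math
import random

def gamma_moments_mc(a, b, n=200_000, seed=0):
    """Monte Carlo estimates of E[X] and E[ln X] for X ~ Gam(a, b)
    with shape a and rate b (illustrative check, not from the text)."""
    rng = random.Random(seed)
    xs = [rng.gammavariate(a, 1.0 / b) for _ in range(n)]  # scale = 1/rate
    mean_x = sum(xs) / n
    mean_ln_x = sum(math.log(x) for x in xs) / n
    return mean_x, mean_ln_x

def digamma(x, h=1e-5):
    """psi(x) = d/dx ln Gamma(x), via a central difference on lgamma."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

a, b = 3.0, 2.0
mean_x, mean_ln_x = gamma_moments_mc(a, b)
# the estimates should approach a/b and psi(a) - ln b, respectively
```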
To get the moments of $q_{W,\tau}(\mathbf{W}_k, \tau_k)$ and $q_V(\mathbf{v}_k)$, we can use $\mathrm{var}(X) = \mathbb{E}(X^2) - \mathbb{E}(X)^2$, and thus $\mathbb{E}(X^2) = \mathrm{var}(X) + \mathbb{E}(X)^2$, to get

$$\mathbb{E}(\mathbf{x}^T \mathbf{x}) = \sum_i \mathbb{E}(x_i^2) = \sum_i \left( \mathrm{var}(x_i) + \mathbb{E}(x_i)^2 \right) = \mathrm{Tr}(\mathrm{cov}(\mathbf{x}, \mathbf{x})) + \mathbb{E}(\mathbf{x})^T \mathbb{E}(\mathbf{x}),$$

and similarly,

$$\mathbb{E}(\mathbf{x}\mathbf{x}^T) = \mathrm{cov}(\mathbf{x}, \mathbf{x}) + \mathbb{E}(\mathbf{x}) \mathbb{E}(\mathbf{x})^T.$$
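Both second-moment identities hold exactly for empirical moments as well, which makes them easy to verify. The sketch below (with arbitrary illustrative data) computes sample mean and covariance and confirms $\mathbb{E}(\mathbf{x}^T\mathbf{x}) = \mathrm{Tr}(\mathrm{cov}(\mathbf{x},\mathbf{x})) + \mathbb{E}(\mathbf{x})^T\mathbb{E}(\mathbf{x})$ and $\mathbb{E}(\mathbf{x}\mathbf{x}^T) = \mathrm{cov}(\mathbf{x},\mathbf{x}) + \mathbb{E}(\mathbf{x})\mathbb{E}(\mathbf{x})^T$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))  # 1000 samples of a 3-d random vector x

mean = X.mean(axis=0)                      # empirical E(x)
cov = np.cov(X, rowvar=False, bias=True)   # empirical cov(x, x), /N normalised

# E(x^T x) = Tr(cov(x, x)) + E(x)^T E(x)
lhs_scalar = np.mean(np.sum(X * X, axis=1))
rhs_scalar = np.trace(cov) + mean @ mean

# E(x x^T) = cov(x, x) + E(x) E(x)^T
lhs_matrix = (X[:, :, None] * X[:, None, :]).mean(axis=0)
rhs_matrix = cov + np.outer(mean, mean)
```

Because the sample covariance is normalised by $N$ (`bias=True`), the identities hold to machine precision rather than only in expectation.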