An Algorithmic Description - Design and Analysis of Learning Classifier Systems


subsequently be used by any global search algorithm that is able to find its maximum in the space of possible model structures, its algorithmic description is kept separate from the model structure search. For the structure search, two simple alternatives are provided in a later section, one based on genetic algorithms, and another based on sampling the model posterior $p(\mathcal{M}|\mathcal{D})$ by MCMC methods. Finally, both approaches are applied to simple regression tasks to demonstrate the usefulness of the classifier set optimality criterion.

8.1 Computing $p(\mathcal{M}|\mathcal{D})$

Let us start with a set of functions that allow the computation of an approximation to $p(\mathcal{M}|\mathcal{D})$. These functions rely on a small set of global system parameters and constants that are given in Table 8.1. The functions are presented in a top-down order, starting with a function that returns $p(\mathcal{M}|\mathcal{D})$ for a given data set $\mathcal{D}$ and model structure $\mathcal{M}$, and continuing with the sub-functions that it calls. The functions use a small set of non-standard operators and global functions that are described in Table 8.2.

The data is assumed to be given by the $N \times D_X$ input matrix $\mathbf{X}$ and the $N \times D_Y$ output matrix $\mathbf{Y}$, as described in Sect. 7.2.1. The model structure is fully defined by the $N \times K$ matching matrix $\mathbf{M}$, that is given by

Table 8.1. Description of the system parameters and constants. These include the distribution parameters of the priors and hyperpriors, and constants that parametrise the stopping criteria of parameter update iterations. The recommended values specify rather uninformative priors and hyperpriors, such that the introduced bias due to these priors is negligible.

Symbol                              Recom.       Description
$a_\alpha$                          $10^{-2}$    Scale parameter of weight vector variance prior
$b_\alpha$                          $10^{-4}$    Shape parameter of weight vector variance prior
$a_\beta$                           $10^{-2}$    Scale parameter of mixing weight vector variance prior
$b_\beta$                           $10^{-4}$    Shape parameter of mixing weight vector variance prior
$a_\tau$                            $10^{-2}$    Scale parameter of noise variance prior
$b_\tau$                            $10^{-4}$    Shape parameter of noise variance prior
$\Delta_s \mathcal{L}_k(q)$         $10^{-4}$    Stopping criterion for classifier update
$\Delta_s \mathcal{L}_M(q)$         $10^{-2}$    Stopping criterion for mixing model update
$\Delta_s \mathrm{KL}(R \| G)$      $10^{-8}$    Stopping criterion for mixing weight update
$\exp_{\min}$                       −            Lowest real number $x$ on system such that $\exp(x) > 0$
$\ln_{\max}$                        −            $\ln(x)$, where $x$ is the highest real number on system
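The parameters and constants of Table 8.1 can be collected in one place before any of the functions are written. The following is a minimal Python sketch under stated assumptions, not the book's implementation: the class name `SystemParameters`, its field names, and the helper `has_converged` are hypothetical, all recommended values are read as powers of ten, and the two system-dependent constants are derived from IEEE double-precision limits. `EXP_MIN` is a conservative stand-in that uses the smallest positive normal double, so it is not strictly the lowest $x$ with $\exp(x) > 0$.

```python
import math
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemParameters:
    """Recommended values from Table 8.1 (field names are illustrative)."""
    a_alpha: float = 1e-2      # weight vector variance prior
    b_alpha: float = 1e-4
    a_beta: float = 1e-2       # mixing weight vector variance prior
    b_beta: float = 1e-4
    a_tau: float = 1e-2        # noise variance prior
    b_tau: float = 1e-4
    delta_s_L_k: float = 1e-4  # stopping criterion, classifier update
    delta_s_L_M: float = 1e-2  # stopping criterion, mixing model update
    delta_s_KL: float = 1e-8   # stopping criterion, mixing weight update


# System-dependent constants (assumption: IEEE double precision).
# EXP_MIN is conservative: exp() of it is the smallest positive normal double.
EXP_MIN = math.log(sys.float_info.min)
LN_MAX = math.log(sys.float_info.max)


def has_converged(previous: float, current: float, threshold: float) -> bool:
    """Assumed reading of a stopping criterion: stop once the monitored
    quantity changes by less than the threshold between two iterations."""
    return abs(current - previous) < threshold
```

An update loop would then call, for example, `has_converged(old_bound, new_bound, params.delta_s_L_k)` after each iteration, where `old_bound` and `new_bound` are placeholders for whatever quantity the update monitors.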
