4 A Probabilistic Model for LCS
Having conceptually defined the LCS model, it will now be embedded into a
formal setting. The formal model is initially designed for a fixed model structure
M; that is, the number of classifiers and where they are localised in the input
space is constant during training of the model. Even though the LCS model could
be characterised purely by its functional form [78], a probabilistic model will be
developed instead. Its advantage is that rather than getting a point estimate
f(x) for the output y given some input x, the probabilistic model provides the
probability distribution p(y | x, θ) that for some input x and model parameters
θ describes the probability density of the output being the vector y. From this
distribution it is possible to form a point estimate from its mean or its mode,
and additionally to get information about the certainty of the prediction from
the spread of the distribution.
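To illustrate the difference between a point estimate and a full predictive distribution, consider the following minimal sketch. It assumes a hypothetical linear-Gaussian model in which p(y | x, θ) is Gaussian with mean f(x) = w'x and fixed standard deviation σ; the parameter values are invented purely for illustration:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical parameters theta = (w, sigma) of a linear-Gaussian model:
# the predictive density p(y | x, theta) is N(y; w @ x, sigma^2).
w = np.array([0.5, -1.2])   # assumed weight vector (illustrative only)
sigma = 0.3                 # assumed output noise standard deviation

def predictive_density(y, x):
    """Density p(y | x, theta) of output y at input x."""
    return norm.pdf(y, loc=w @ x, scale=sigma)

x = np.array([1.0, 2.0])
y_hat = w @ x    # point estimate: mean (here also mode) of p(y | x, theta)
spread = sigma   # certainty of the prediction: spread of the distribution
print(y_hat, spread, predictive_density(y_hat, x))
```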
This chapter concentrates on modelling the data by the principle of maximum
likelihood: given a set of observations D = {X, Y}, the best model parameters θ
are the ones that maximise the probability of the observations given the model
parameters p(D | θ). As described in the previous chapter, this might lead to
overfitting the data, but nonetheless it results in a first idea about how the model
can be trained, and relates it closely to XCS, where overfitting is controlled on
the model structure level rather than the model parameter level (see App. B).
Chapter 7 generalises this model and introduces a training method that avoids
overfitting.
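As a concrete instance of the maximum likelihood principle, the sketch below (not the LCS-specific training procedure, which is developed over the coming sections) fits a hypothetical linear-Gaussian model; for such a model, maximising p(D | θ) over the weights is equivalent to minimising the squared error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic observations D = {X, Y} from an invented linear-Gaussian process.
X = rng.normal(size=(100, 2))                 # inputs, one row per observation
w_true = np.array([0.5, -1.2])                # assumed generating weights
Y = X @ w_true + 0.1 * rng.normal(size=100)   # noisy outputs

# Maximum-likelihood weights: with Gaussian noise, maximising p(D | theta)
# over the weights reduces to solving the least-squares problem.
w_ml, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Maximum-likelihood noise variance: the mean squared residual.
sigma2_ml = np.mean((Y - X @ w_ml) ** 2)
print(w_ml, sigma2_ml)
```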
The formulation of the probabilistic model is guided by a related machine
learning model: the Mixtures-of-Experts (MoE) model [120, 121] fits the data
by a fixed number of localised experts. Even though not identified by previous
LCS research, there are strong similarities between LCS and MoE when relating
the classifiers of LCS to the experts of MoE. However, they differ in that the
localisation of the experts in MoE is changed by a gating network that assigns
observations to experts, whereas in LCS the localisation of classifiers is defined
by the matching functions and is fixed for a constant model structure. To relate
these two approaches, the model is modified such that it acts as a generalisation
of both the standard MoE model and LCS. Furthermore, difficulties in training
the emerging model are solved by detaching expert training from training the
gating network.
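The contrast can be made concrete with a small sketch: in MoE, a trainable gating network produces soft assignments of observations to experts, whereas in LCS the matching functions give fixed binary assignments determined by the model structure M. All parameter values and matching conditions below are invented for illustration:

```python
import numpy as np

def softmax(a):
    a = a - a.max()          # numerically stable softmax
    e = np.exp(a)
    return e / e.sum()

# MoE: a gating network assigns responsibilities g_k(x) to the experts,
# and its parameters V are adapted during training (values hypothetical).
V = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])  # one row per expert

def gating_moe(x):
    """Trainable soft assignment g_k(x) = softmax(v_k @ x)."""
    return softmax(V @ x)

# LCS: matching functions m_k(x) in {0, 1} are fixed by the model
# structure M; here, hypothetical interval conditions on x[0].
intervals = [(-np.inf, 0.0), (0.0, 1.0), (1.0, np.inf)]

def matching_lcs(x):
    """Fixed binary assignment: classifier k matches iff x[0] lies in its interval."""
    return np.array([float(lo <= x[0] < hi) for lo, hi in intervals])

x = np.array([0.4, 1.0])
print(gating_moe(x))    # soft, learned responsibilities
print(matching_lcs(x))  # hard, structure-determined matching
```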