4 A Probabilistic Model for LCS
Having conceptually defined the LCS model, it will now be embedded into a
formal setting. The formal model is initially designed for a fixed model structure
M; that is, the number of classifiers and where they are localised in the input
space is constant during training of the model. Even though the LCS model could
be characterised purely by its functional form [78], a probabilistic model will be
developed instead. Its advantage is that rather than getting a point estimate
f(x) for the output y given some input x, the probabilistic model provides the
probability distribution p(y | x, θ) that for some input x and model parameters
θ describes the probability density of the output being the vector y. From this
distribution it is possible to form a point estimate from its mean or its mode,
and additionally to get information about the certainty of the prediction from
the spread of the distribution.
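To illustrate the difference between a point estimate and a full predictive distribution, consider the following minimal sketch. It assumes a hypothetical linear-Gaussian model in which p(y | x, θ) is Gaussian with mean f(x) = w'x and fixed standard deviation σ; the parameter values are invented purely for illustration:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical parameters theta = (w, sigma) of a linear-Gaussian model:
# the predictive density p(y | x, theta) is N(y; w @ x, sigma^2).
w = np.array([0.5, -1.2])   # assumed weight vector (illustrative only)
sigma = 0.3                 # assumed output noise standard deviation

def predictive_density(y, x):
    """Density p(y | x, theta) of output y at input x."""
    return norm.pdf(y, loc=w @ x, scale=sigma)

x = np.array([1.0, 2.0])
y_hat = w @ x    # point estimate: mean (here also mode) of p(y | x, theta)
spread = sigma   # certainty of the prediction: spread of the distribution
print(y_hat, spread, predictive_density(y_hat, x))
```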
This chapter concentrates on modelling the data by the principle of maximum
likelihood: given a set of observations D = {X, Y}, the best model parameters θ
are the ones that maximise the probability of the observations given the model
parameters p(D | θ). As described in the previous chapter, this might lead to
overfitting the data, but nonetheless it results in a first idea about how the model
can be trained, and relates it closely to XCS, where overfitting is controlled on
the model structure level rather than the model parameter level (see App. B).
Chapter 7 generalises this model and introduces a training method that avoids
overfitting.
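As a concrete instance of the maximum likelihood principle, the sketch below (not the LCS-specific training procedure, which is developed over the coming sections) fits a hypothetical linear-Gaussian model; for such a model, maximising p(D | θ) over the weights is equivalent to minimising the squared error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic observations D = {X, Y} from an invented linear-Gaussian process.
X = rng.normal(size=(100, 2))                 # inputs, one row per observation
w_true = np.array([0.5, -1.2])                # assumed generating weights
Y = X @ w_true + 0.1 * rng.normal(size=100)   # noisy outputs

# Maximum-likelihood weights: with Gaussian noise, maximising p(D | theta)
# over the weights reduces to solving the least-squares problem.
w_ml, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Maximum-likelihood noise variance: the mean squared residual.
sigma2_ml = np.mean((Y - X @ w_ml) ** 2)
print(w_ml, sigma2_ml)
```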
The formulation of the probabilistic model is guided by a related machine
learning model: the Mixtures-of-Experts (MoE) model [120, 121] fits the data
by a fixed number of localised experts. Even though not identified by previous
LCS research, there are strong similarities between LCS and MoE when relating
the classifiers of LCS to the experts of MoE. However, they differ in that the
localisation of the experts in MoE is changed by a gating network that assigns
observations to experts, whereas in LCS the localisation of classifiers is defined
by the matching functions and is fixed for a constant model structure. To relate
these two approaches, the model is modified such that it acts as a generalisation
of both the standard MoE model and LCS. Furthermore, difficulties in training
the emerging model are solved by detaching expert training from training the
gating network.
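The contrast can be made concrete with a small sketch: in MoE, a trainable gating network produces soft assignments of observations to experts, whereas in LCS the matching functions give fixed binary assignments determined by the model structure M. All parameter values and matching conditions below are invented for illustration:

```python
import numpy as np

def softmax(a):
    a = a - a.max()          # numerically stable softmax
    e = np.exp(a)
    return e / e.sum()

# MoE: a gating network assigns responsibilities g_k(x) to the experts,
# and its parameters V are adapted during training (values hypothetical).
V = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])  # one row per expert

def gating_moe(x):
    """Trainable soft assignment g_k(x) = softmax(v_k @ x)."""
    return softmax(V @ x)

# LCS: matching functions m_k(x) in {0, 1} are fixed by the model
# structure M; here, hypothetical interval conditions on x[0].
intervals = [(-np.inf, 0.0), (0.0, 1.0), (1.0, np.inf)]

def matching_lcs(x):
    """Fixed binary assignment: classifier k matches iff x[0] lies in its interval."""
    return np.array([float(lo <= x[0] < hi) for lo, hi in intervals])

x = np.array([0.4, 1.0])
print(gating_moe(x))    # soft, learned responsibilities
print(matching_lcs(x))  # hard, structure-determined matching
```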