As classifiers can only generate observations if they match the corresponding input, the classifier model itself does not require any modification. Additionally, (4.9) is still valid, as $z_k = 1$ only if $m_k = 1$ by (4.20). Figure 4.3 shows the graphical model that, when compared to Fig. 4.1, illustrates the changes that are introduced by generalising the MoE model.
4.3.2 Updated Expectation-Maximisation Training
The only modifications to the standard MoE are changes to the gating network, expressed by $g_k$. As (4.12), (4.13) and (4.14) are independent of the functional form of $g_k$, they are still valid for the generalised MoE. Therefore, the expectation step of the EM-algorithm is again performed by evaluating the responsibilities by (4.12), and the gating and classifier models are updated by (4.13) and (4.14). Convergence of the algorithm is again monitored by (4.9).
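To make the update cycle concrete, the following is a minimal NumPy sketch of one EM iteration for a generalised MoE with linear classifiers. It assumes Gaussian classifier likelihoods with a shared variance var and the matched-softmax form of the gating (4.22) with the identity transfer function; the gating-weight update (4.13), usually performed by iteratively re-weighted least squares, is omitted, and all names (matched_softmax, e_step, m_step_classifiers, V, W, M) are illustrative rather than taken from the source.

import numpy as np

def matched_softmax(X, V, M):
    # Gating probabilities g_nk: a softmax over the gating activations,
    # restricted to matching classifiers (a sketch of the assumed form of
    # (4.22)). Assumes every input is matched by at least one classifier,
    # so that no row of G sums to zero.
    A = X @ V.T                          # gating activations, N x K
    A -= A.max(axis=1, keepdims=True)    # subtract row maximum for stability
    G = M * np.exp(A)                    # zero out non-matching classifiers
    return G / G.sum(axis=1, keepdims=True)

def e_step(X, y, V, W, var, M):
    # Responsibilities r_nk proportional to g_k(x_n) p(y_n | x_n, theta_k),
    # cf. (4.12), with Gaussian classifier models of shared variance var.
    # The Gaussian normalisation constant cancels in the row-wise division.
    G = matched_softmax(X, V, M)
    mu = X @ W.T                         # classifier predictions, N x K
    lik = np.exp(-0.5 * (y[:, None] - mu) ** 2 / var)
    R = G * lik
    return R / R.sum(axis=1, keepdims=True)

def m_step_classifiers(X, y, R):
    # Responsibility-weighted least squares per classifier, cf. (4.14),
    # with a small ridge term for numerical stability.
    K, D = R.shape[1], X.shape[1]
    W = np.empty((K, D))
    for k in range(K):
        Xw = X.T * R[:, k]               # weight each row of X by r_nk
        W[k] = np.linalg.solve(Xw @ X + 1e-8 * np.eye(D), Xw @ y)
    return W

Iterating e_step and m_step_classifiers, together with a gating-weight update, until (4.9) stops improving mirrors the EM cycle described above.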
4.3.3 Implications on Localisation
Localisation of the classifiers is achieved on the one hand by the matching functions of the classifiers, and on the other hand by the combined training of the gating network and classifiers.
Let us first consider the case when the $n$th observation $(x_n, y_n)$ is matched by one and only one classifier $k$, that is, $m_j(x_n) = 1$ only if $j = k$, and $m_j(x_n) = 0$ otherwise. Hence, by (4.22), $g_j(x_n) = 1$ only if $j = k$, and $g_j(x_n) = 0$ otherwise, and consequently, by (4.12), $r_{nj} = 1$ only if $j = k$, and $r_{nj} = 0$ otherwise. Therefore, full responsibility for the observation is given to the one and only matching classifier, independent of its goodness-of-fit.
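This collapse can be read directly off the gating model. Assuming (4.22) has the matched-softmax form with gating weights $v_j$ and transfer function $\phi$, a single matching classifier leaves only one non-zero term in both the numerator and the denominator:

$$ g_k(x_n) = \frac{m_k(x_n) \exp\left(v_k^\top \phi(x_n)\right)}{\sum_{j=1}^{K} m_j(x_n) \exp\left(v_j^\top \phi(x_n)\right)} = \frac{\exp\left(v_k^\top \phi(x_n)\right)}{\exp\left(v_k^\top \phi(x_n)\right)} = 1, $$

irrespective of the value of $v_k$, which is why the gating weights cannot modulate responsibilities where only a single classifier matches.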
On the other hand, assume that the same observation $(x_n, y_n)$ is matched by all classifiers, that is, $m_j(x_n) = 1$ for all $j \in \{1, \dots, K\}$, and assume the identity transfer function $\phi(x) = x$. In that case, (4.22) reduces to the standard MoE gating network (4.5), and we perform a soft linear partitioning as described in Sect. 4.1.4.
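Both limiting cases are easy to verify numerically. The sketch below repeats the illustrative matched_softmax helper from Sect. 4.3.2 so that it runs on its own; as before, this is an assumption about the form of (4.22), not code from the source, with $\phi$ taken as the identity.

import numpy as np

def matched_softmax(X, V, M):
    # Matched softmax gating, as in the sketch of Sect. 4.3.2.
    A = X @ V.T
    A -= A.max(axis=1, keepdims=True)
    G = M * np.exp(A)
    return G / G.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 3))              # a single input x_n
V = rng.normal(size=(4, 3))              # gating weights for K = 4 classifiers

# Only classifier 2 matches: it receives g = 1, whatever V is.
print(matched_softmax(x, V, np.array([[0., 0., 1., 0.]])))  # [[0. 0. 1. 0.]]

# All classifiers match: (4.22) reduces to the standard softmax gating (4.5).
A = x @ V.T
std = np.exp(A - A.max()) / np.exp(A - A.max()).sum()
print(np.allclose(matched_softmax(x, V, np.ones((1, 4))), std))  # True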
In summary, localisation by matching determines for which areas of the input space the classifiers attempt to model the observations. In areas where they match, they are distributed by soft linear partitions as in the standard MoE model. Hence, we can acquire a two-layer intuition of how localisation is performed: matching determines the rough areas where classifiers are responsible for modelling the observations, and the softmax function then performs the fine-tuning in areas of overlap between classifiers.
4.3.4 Relation to Standard MoE Model
The only difference between the generalised MoE model and the standard MoE model is the definition of the gating model $g_k$. Comparing the standard model (4.5) with its generalisation (4.22), the standard model is recovered from the generalisation by having $m_k(x) = 1$ for all $k$ and $x$, and the identity transfer function $\phi(x) = x$ for all $x$. Defining the matching functions in such a way is equivalent to having every classifier match the whole of the input space, leaving the soft linear partitioning of the softmax as the only source of localisation.
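Written out, and again assuming the matched-softmax form of (4.22), the reduction is immediate: with $m_k(x) = 1$ everywhere and $\phi(x) = x$,

$$ g_k(x) = \frac{m_k(x) \exp\left(v_k^\top \phi(x)\right)}{\sum_{j=1}^{K} m_j(x) \exp\left(v_j^\top \phi(x)\right)} = \frac{\exp\left(v_k^\top x\right)}{\sum_{j=1}^{K} \exp\left(v_j^\top x\right)}, $$

which is exactly the softmax gating (4.5) of the standard MoE model.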