which is, as already derived for batch learning (5.14), the matching-weighted
average over all observed outputs.
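As an illustrative sketch of such a matching-weighted average (the function name, the matching values m(x_n), and the outputs below are hypothetical, not taken from the text), the batch estimate is simply the sum of observed outputs weighted by how strongly each input is matched:

```python
def matching_weighted_average(matches, outputs):
    """Batch estimate (cf. (5.14)): outputs y_n averaged with
    weights given by the classifier's matching m(x_n)."""
    num = sum(m * y for m, y in zip(matches, outputs))
    den = sum(matches)
    return num / den

# classifier fully matches inputs 1 and 3 and ignores input 2,
# so the unmatched output 99.0 does not influence the estimate
avg = matching_weighted_average([1.0, 0.0, 1.0], [2.0, 99.0, 4.0])  # -> 3.0
```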
Interestingly, XCS applies the MAM update, which is equivalent to averaging the output for the first γ⁻¹ inputs, where γ is the step size, and then tracking the output using the LMS algorithm [237]. In other words, it bootstraps its weight estimate using the RLS algorithm, and then continues tracking the output using the LMS algorithm. Note that this is only the case for XCS with averaging
classifiers, and does not apply for XCS derivatives that use more complex models,
such as XCSF. Even though not explicitly stated by Wilson [241] and others, it is assumed that the MAM update is not used for the weight update in those XCS derivatives, but that it is still applied when updating a classifier's scalar parameters, such as its relative accuracy and fitness.
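The MAM behaviour described above can be sketched as follows; the function name, the step size γ = 0.2, and the target sequence are illustrative assumptions, not values from the text:

```python
def mam_update(estimate, target, n, gamma):
    """MAM update: average the first 1/gamma observations,
    then track the target with the LMS rule."""
    if n <= 1.0 / gamma:
        # running average of the first n targets (effective step size 1/n)
        return estimate + (target - estimate) / n
    # LMS: fixed-step gradient move towards the latest target
    return estimate + gamma * (target - estimate)

# with gamma = 0.2, the first 1/gamma = 5 updates form a plain average
est = 0.0
for n, t in enumerate([4.0, 6.0, 5.0, 5.0, 20.0], start=1):
    est = mam_update(est, t, n, gamma=0.2)
# est is now (4 + 6 + 5 + 5 + 20) / 5 = 8.0
```

After the first γ⁻¹ observations, further updates would drift towards recent targets with fixed step size γ, which is exactly the tracking behaviour of the LMS algorithm.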
5.3.6 The Kalman Filter
The RLS algorithm was introduced purely on the basis of the Principle of Orthogonality, without consideration of the probabilistic structure of the random variables. Even though the Kalman filter results in the same update equations, it provides additional probabilistic information and hence supports a better understanding of the method. Furthermore, its use is advantageous because “[...] the Kalman filter is optimal with respect to virtually any criterion that makes sense” [164, Chap. 1].
Firstly, the system model is introduced, from which the update equations in covariance form and in inverse covariance form are derived. This is followed by considering how both the system state and the measurement noise can be estimated simultaneously by making use of the Minimum Model Error philosophy. The resulting algorithm is finally related to the RLS algorithm.
The System Model
The Kalman-Bucy system model [123, 124] describes how a noisy process modifies the state of a system, and how this affects the noisy observation of that system. Both the process and the relation between system state and observation are assumed to be linear, and all noise is zero-mean white (uncorrelated) Gaussian noise.
In our case, the process that generates the observations is assumed to be stationary, which is expressed by a constant system state. Additionally, the observations are linearly related to the system state, and all deviations from that linearity are covered by zero-mean white (uncorrelated) Gaussian noise. The resulting model is
υ_n = ω^T x_n + ε_n,    (5.48)
where υ_n is the random variable that represents the observed nth scalar output of the system, ω is the system state random variable, x_n is the known nth input vector to the system, and ε_n is the measurement noise associated with observing y_n.
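A minimal numerical sketch of this system model follows; the state ω, the input distribution, and the noise level are assumed values for illustration and do not come from the text. It draws observations from a constant linear state corrupted by zero-mean Gaussian noise, and then recovers the state by batch least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
omega = np.array([2.0, -1.0])            # assumed constant system state
X = rng.normal(size=(1000, 2))           # known input vectors x_n
eps = rng.normal(scale=0.1, size=1000)   # zero-mean white Gaussian noise
y = X @ omega + eps                      # observed outputs y_n

# batch least-squares estimate of the system state from all observations
omega_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because the state is constant and the noise is zero-mean, the batch estimate converges to ω as more observations arrive; the Kalman filter discussed next arrives at the same estimate incrementally.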