which is, as already derived for batch learning (5.14), the matching-weighted
average over all observed outputs.
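As an illustrative sketch of such a matching-weighted average (the function name, the matching values m(x_n), and the outputs below are hypothetical, not taken from the text), the batch estimate is simply the sum of observed outputs weighted by how strongly each input is matched:

```python
def matching_weighted_average(matches, outputs):
    """Batch estimate (cf. (5.14)): outputs y_n averaged with
    weights given by the classifier's matching m(x_n)."""
    num = sum(m * y for m, y in zip(matches, outputs))
    den = sum(matches)
    return num / den

# classifier fully matches inputs 1 and 3 and ignores input 2,
# so the unmatched output 99.0 does not influence the estimate
avg = matching_weighted_average([1.0, 0.0, 1.0], [2.0, 99.0, 4.0])  # -> 3.0
```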
Interestingly, XCS applies the MAM update, which is equivalent to averaging the output for the first γ⁻¹ inputs, where γ is the step size, and then tracking the output using the LMS algorithm [237]. In other words, it bootstraps its weight estimate using the RLS algorithm, and then continues tracking the output using the LMS algorithm. Note that this is only the case for XCS with averaging
classifiers, and does not apply for XCS derivatives that use more complex models,
such as XCSF. Even though not explicitly stated by Wilson [241] and others, it is assumed that the MAM update is not used for the weight update in those XCS derivatives, but that it is still applied when updating a classifier's scalar parameters, such as its relative accuracy and fitness.
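The MAM behaviour described above can be sketched as follows; the function name, the step size γ = 0.2, and the target sequence are illustrative assumptions, not values from the text:

```python
def mam_update(estimate, target, n, gamma):
    """MAM update: average the first 1/gamma observations,
    then track the target with the LMS rule."""
    if n <= 1.0 / gamma:
        # running average of the first n targets (effective step size 1/n)
        return estimate + (target - estimate) / n
    # LMS: fixed-step gradient move towards the latest target
    return estimate + gamma * (target - estimate)

# with gamma = 0.2, the first 1/gamma = 5 updates form a plain average
est = 0.0
for n, t in enumerate([4.0, 6.0, 5.0, 5.0, 20.0], start=1):
    est = mam_update(est, t, n, gamma=0.2)
# est is now (4 + 6 + 5 + 5 + 20) / 5 = 8.0
```

After the first γ⁻¹ observations, further updates would drift towards recent targets with fixed step size γ, which is exactly the tracking behaviour of the LMS algorithm.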
5.3.6 The Kalman Filter
The RLS algorithm was introduced purely on the basis of the Principle of Orthogonality, without consideration of the probabilistic structure of the random variables. Even though the Kalman filter results in the same update equations, it provides additional probabilistic information and hence supports a better understanding of the method. Furthermore, its use is advantageous because “[...] the Kalman filter is optimal with respect to virtually any criterion that makes sense” [164, Chap. 1].
Firstly, the system model is introduced, from which the update equations in covariance form and in inverse covariance form are derived. This is followed by considering how both the system state and the measurement noise can be estimated simultaneously by making use of the Minimum Model Error philosophy. The resulting algorithm is finally related to the RLS algorithm.
The System Model
The Kalman-Bucy system model [123, 124] describes how a noisy process modifies the state of a system, and how this affects the noisy observation of that system. Both the process and the relation between system state and observation are assumed to be linear, and all noise is zero-mean white (uncorrelated) Gaussian noise.
In our case, the process that generates the observations is assumed to be stationary, which is expressed by a constant system state. Additionally, the observations are linearly related to the system state, and all deviations from that linearity are covered by zero-mean white (uncorrelated) Gaussian noise. The resulting model is
υ_n = ω^T x_n + ε_n,    (5.48)
where υ_n is the random variable that represents the observed nth scalar output of the system, ω is the system state random variable, x_n is the known nth input vector to the system, and ε_n is the measurement noise associated with observing y_n.
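A minimal numerical sketch of this system model follows; the state ω, the input distribution, and the noise level are assumed values for illustration and do not come from the text. It draws observations from a constant linear state corrupted by zero-mean Gaussian noise, and then recovers the state by batch least squares:

```python
import numpy as np

rng = np.random.default_rng(0)
omega = np.array([2.0, -1.0])            # assumed constant system state
X = rng.normal(size=(1000, 2))           # known input vectors x_n
eps = rng.normal(scale=0.1, size=1000)   # zero-mean white Gaussian noise
y = X @ omega + eps                      # observed outputs y_n

# batch least-squares estimate of the system state from all observations
omega_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because the state is constant and the noise is zero-mean, the batch estimate converges to ω as more observations arrive; the Kalman filter discussed next arrives at the same estimate incrementally.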