Note that $\lambda^{\sum_{j=n+1}^{N} m(\mathbf{x}_j)}$ is used rather than simply $\lambda^{N-n}$ to only decay past observations if the current observation is matched. As before, the solution $\mathbf{w}_N$ that minimises (5.38) satisfies

$$\left( \mathbf{X}_N^{\mathsf{T}} \mathbf{M}_N \mathbf{X}_N \right) \mathbf{w}_N = \mathbf{X}_N^{\mathsf{T}} \mathbf{M}_N \mathbf{y}_N, \qquad (5.40)$$

where the diagonal entries of $\mathbf{M}_N$ now combine matching and decay, $(\mathbf{M}_N)_{nn} = m(\mathbf{x}_n)\, \lambda^{\sum_{j=n+1}^{N} m(\mathbf{x}_j)}$.
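For concreteness, (5.40) can also be evaluated in batch form. The following numpy sketch is my own illustration, not part of the original text (the function name and the explicit construction of $\mathbf{M}_N$ are assumptions); it is useful, for instance, for verifying the recursive updates derived below against the batch solution:

```python
import numpy as np

def recency_weighted_solution(X, y, m, lam):
    """Batch solution of (5.40): solve (X^T M X) w = X^T M y, where the
    diagonal of M holds m(x_n) * lam**(number of matched inputs after n)."""
    m = np.asarray(m, dtype=float)
    suffix = np.cumsum(m[::-1])[::-1]            # suffix[n] = sum_{j >= n} m(x_j)
    decay = np.concatenate((suffix[1:], [0.0]))  # sum_{j > n} m(x_j)
    M = np.diag(m * lam ** decay)
    return np.linalg.solve(X.T @ M @ X, X.T @ M @ y)
```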
Using $\boldsymbol{\Lambda}_N = \mathbf{X}_N^{\mathsf{T}} \mathbf{M}_N \mathbf{X}_N$ and the relations

$$\boldsymbol{\Lambda}_{N+1} = \lambda^{m(\mathbf{x}_{N+1})} \boldsymbol{\Lambda}_N + m(\mathbf{x}_{N+1})\, \mathbf{x}_{N+1} \mathbf{x}_{N+1}^{\mathsf{T}}, \qquad (5.41)$$

$$\boldsymbol{\Lambda}_{N+1} \mathbf{w}_{N+1} = \lambda^{m(\mathbf{x}_{N+1})} \boldsymbol{\Lambda}_N \mathbf{w}_N + m(\mathbf{x}_{N+1})\, \mathbf{x}_{N+1} y_{N+1}, \qquad (5.42)$$
the recency-weighted RLS weight vector update is given by

$$\mathbf{w}_{N+1} = \mathbf{w}_N + m(\mathbf{x}_{N+1})\, \boldsymbol{\Lambda}_{N+1}^{-1} \mathbf{x}_{N+1} \left( y_{N+1} - \mathbf{w}_N^{\mathsf{T}} \mathbf{x}_{N+1} \right). \qquad (5.43)$$
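To make the step from (5.41) and (5.42) to (5.43) explicit (this intermediate derivation is my addition, not part of the original text): substituting $\lambda^{m(\mathbf{x}_{N+1})} \boldsymbol{\Lambda}_N = \boldsymbol{\Lambda}_{N+1} - m(\mathbf{x}_{N+1})\, \mathbf{x}_{N+1} \mathbf{x}_{N+1}^{\mathsf{T}}$ from (5.41) into (5.42) gives

$$\boldsymbol{\Lambda}_{N+1} \mathbf{w}_{N+1} = \boldsymbol{\Lambda}_{N+1} \mathbf{w}_N + m(\mathbf{x}_{N+1})\, \mathbf{x}_{N+1} \left( y_{N+1} - \mathbf{x}_{N+1}^{\mathsf{T}} \mathbf{w}_N \right),$$

and pre-multiplying both sides by $\boldsymbol{\Lambda}_{N+1}^{-1}$ yields (5.43).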
The matrix $\boldsymbol{\Lambda}^{-1}$ can be updated either by using (5.41) and inverting the result, or by applying the Sherman-Morrison formula to get

$$\boldsymbol{\Lambda}_{N+1}^{-1} = \lambda^{-m(\mathbf{x}_{N+1})} \boldsymbol{\Lambda}_N^{-1} - m(\mathbf{x}_{N+1})\, \lambda^{-m(\mathbf{x}_{N+1})} \frac{\boldsymbol{\Lambda}_N^{-1} \mathbf{x}_{N+1} \mathbf{x}_{N+1}^{\mathsf{T}} \boldsymbol{\Lambda}_N^{-1}}{\lambda^{m(\mathbf{x}_{N+1})} + m(\mathbf{x}_{N+1})\, \mathbf{x}_{N+1}^{\mathsf{T}} \boldsymbol{\Lambda}_N^{-1} \mathbf{x}_{N+1}}. \qquad (5.44)$$
All equations from this section reduce to the non-recency-weighted equivalents if $\lambda = 1$.
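As an illustration of the complete recency-weighted RLS step, here is a minimal Python sketch; it is my own, not from the original text, and the class name, the numpy-based implementation, and the initialisation of $\boldsymbol{\Lambda}_0^{-1}$ as a large multiple of the identity are assumptions:

```python
import numpy as np

class RecencyWeightedRLS:
    """Recency-weighted RLS tracker following (5.43) and (5.44).

    Maintains the weight vector w and the inverse matrix Lambda^{-1};
    lam is the decay factor, with lam = 1 recovering ordinary RLS.
    """

    def __init__(self, dim, lam=0.99, delta=1000.0):
        self.lam = lam
        self.w = np.zeros(dim)
        # Assumption: Lambda_0^{-1} = delta * I, a common choice since
        # Lambda_0 = 0 cannot be inverted directly.
        self.Lambda_inv = delta * np.eye(dim)

    def update(self, x, y, m):
        """One step with input x, output y, and matching m(x)."""
        if m == 0:
            return  # with m = 0, (5.41)-(5.43) leave w and Lambda unchanged
        lam_m = self.lam ** m
        Li_x = self.Lambda_inv @ x
        # Sherman-Morrison update of Lambda^{-1}, eq. (5.44)
        self.Lambda_inv = (self.Lambda_inv / lam_m
                           - (m / lam_m) * np.outer(Li_x, Li_x)
                           / (lam_m + m * (x @ Li_x)))
        # weight vector update, eq. (5.43)
        self.w += m * (self.Lambda_inv @ x) * (y - self.w @ x)
```

Setting `lam=1` makes every step an ordinary RLS update, in line with the remark above.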
In summary, the RLS algorithm recursively tracks the solution according to the Principle of Orthogonality. As this solution is always optimal in the least squares sense, there is no need to discuss its convergence to the optimal solution, as was required for gradient-based algorithms. While the RLS algorithm can also be adjusted to perform recency-weighting, as developed in this section, its only drawback when compared to the LMS or normalised LMS algorithm is its higher computational cost, which grows quadratically rather than linearly with the dimensionality of the input vectors. Nonetheless, if this additional cost is bearable, it should always be preferred to the gradient-based algorithm, as will be demonstrated in Sect. 5.4.
Example 5.6 (RLS Algorithm for Averaging Classifiers). Consider averaging classifiers, such that $x_n = 1$ for all $n > 0$. Hence, (5.31) becomes
$$\Lambda_{N+1} = \Lambda_N + m(x_{N+1}), \qquad (5.45)$$

which, when starting with $\Lambda_0 = 0$, is equivalent to the match count, $\Lambda_N = c_N$.
The weight update after (5.34) reduces to

$$w_{N+1} = w_N + m(x_{N+1})\, c_{N+1}^{-1} \left( y_{N+1} - w_N \right). \qquad (5.46)$$
Note that this is equivalent to the LMS algorithm (5.25) for averaging classifiers when using the step size $\gamma_N = c_N^{-1}$. By recursive back-substitution of the above, and using $w_0 = 0$, we get
$$w_N = c_N^{-1} \sum_{n=1}^{N} m(x_n)\, y_n, \qquad (5.47)$$

that is, $w_N$ is the average of all matched outputs observed up to $N$.
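To see the equivalence between the incremental update (5.46) and the batch average (5.47) numerically, the following small Python check (my own illustration; the random data is arbitrary) runs both and compares them:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=50)            # outputs y_1, ..., y_N
m = rng.integers(0, 2, size=50)    # matching m(x_n) drawn from {0, 1}

w, c = 0.0, 0                      # w_0 = 0 and match count c_0 = 0
for y_n, m_n in zip(y, m):
    c += m_n                       # eq. (5.45): Lambda_N = c_N
    if m_n:
        w += m_n / c * (y_n - w)   # eq. (5.46)

w_batch = (m * y).sum() / c        # eq. (5.47): average of matched outputs
assert np.isclose(w, w_batch)
```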