2.3 The LCS Renaissance
Before introducing XCS, Wilson developed ZCS [236] as a minimalist classifier
system whose reductionist approach aimed to provide a better understanding
of the underlying mechanisms. ZCS still bases classifier fitness on strength,
using a version of the implicit bucket brigade for credit assignment,
but utilises fitness sharing to penalise overly general classifiers.
Only a year after publishing ZCS, Wilson introduced XCS [237], which
significantly influenced future LCS research. Its distinguishing feature is that the
fitness of a classifier is no longer its strength, but its accuracy in predicting
the expected reward (footnote 2). Consequently, XCS maintains information about
low-rewarding areas of the environment and penalises classifiers that match overly
large areas, as their reward prediction becomes inaccurate. By using a niche
GA that restricts reproduction to classifiers that match the currently observed
state and promote the performed action, and by removing classifiers independently
of what they match, XCS prefers classifiers that match more states as long as they
are still accurate, thus aiming towards optimally general classifiers (footnote 3).
More information about Wilson's motivation for developing XCS, and an in-depth
description of its functionality, can be found in Kovacs' Ph.D. thesis [133]. A short
introduction to XCS from the model-based perspective is given in App. B.
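The difference between strength-based and accuracy-based fitness can be
illustrated with a minimal sketch. It is not Wilson's implementation: the
reward values, the batch estimates of prediction and error, and the parameter
values eps0, alpha and nu are assumptions chosen for illustration, and the
power-law accuracy function is one commonly described form rather than
something specified in this section.

```python
from statistics import mean


def prediction_and_error(rewards):
    """Return the mean reward and the mean absolute deviation from it."""
    p = mean(rewards)
    eps = mean(abs(r - p) for r in rewards)
    return p, eps


def accuracy(eps, eps0=10.0, alpha=0.1, nu=5.0):
    """Assumed accuracy function: 1 below the error threshold eps0,
    dropping off steeply (power law) above it."""
    if eps < eps0:
        return 1.0
    return alpha * (eps / eps0) ** -nu


# Hypothetical single-step task: rewards observed in the states
# that each classifier matches.
low_reward_specialist = [0.0, 0.0, 0.0, 0.0]
high_reward_specialist = [1000.0, 1000.0, 1000.0, 1000.0]
overly_general = low_reward_specialist + high_reward_specialist

classifiers = {
    "low-reward specialist": low_reward_specialist,
    "high-reward specialist": high_reward_specialist,
    "overly general": overly_general,
}

for name, rewards in classifiers.items():
    p, eps = prediction_and_error(rewards)
    # Strength-based fitness would rank classifiers by p alone;
    # accuracy-based fitness ranks them by accuracy(eps).
    print(f"{name}: prediction={p}, error={eps}, accuracy={accuracy(eps)}")
```

Under these assumptions, ranking by strength would place the overly general
classifier above the low-reward specialist, whereas ranking by accuracy keeps
both specialists and penalises the classifier that spans states with
different rewards.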
After its introduction, XCS was frequently modified and extended, and its
theoretical properties and exact working analysed. This makes it, at the time
of this writing, the most used and best analysed LCS available. These modifi-
cations also enhanced the intuitive understanding of the role of the classifiers
within the system, and as the proposed LCS model borrows much of its design
and intuition from XCS, the following sections give further background on the
role of a classifier in XCS and its extensions. In the following, only single-step
tasks, where a reward is received after each action, are considered. The detailed
description of multi-step tasks is postponed to Chap. 9.
2.3.1 Computing the Prediction
Initially, each classifier in XCS only provided a single prediction for all states that
it matches, independent of the nature of these states [237, 238, 239]. In XCSF
[240, 241], this was extended such that each classifier represents a straight line
and thus is able to vary its prediction over the states that it matches, based
on the numerical value of the state. This concept was soon picked up by other
researchers and was quickly extended to higher-order polynomials [141, 142, 143],
2 Fitness measures other than strength had been suggested earlier, but were
never implemented in the form of pure accuracy. Even in the first LCS paper,
Holland suggested that fitness should be based not only on the reward but also
on the consistency of the prediction [111], which was also implemented [116]. Later,
however, Holland focused purely on strength-based fitness [237]. A further LCS that
uses an accuracy-like fitness measure is Booker's GOFER-1 [21].
3 Wilson and others call optimally general classifiers maximally general [237], which
could lead to the misinterpretation that these classifiers match all states.
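The step from a single prediction per classifier to a straight-line prediction,
as described for XCSF in Sect. 2.3.1, can be sketched as follows. This is an
illustration only: the target function, the sample states and the closed-form
least-squares fit are assumptions made here, and the section does not state
how XCSF itself learns the line.

```python
from statistics import mean


def fit_constant(xs, rs):
    """XCS-style classifier: one prediction for all matched states."""
    p = mean(rs)
    return lambda x: p


def fit_line(xs, rs):
    """XCSF-style classifier: prediction varies linearly with the state.
    Fitted by ordinary least squares (an assumption for this sketch)."""
    x_bar, r_bar = mean(xs), mean(rs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxr = sum((x - x_bar) * (r - r_bar) for x, r in zip(xs, rs))
    slope = sxr / sxx
    intercept = r_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_abs_error(predict, xs, rs):
    """Average absolute prediction error over the matched states."""
    return mean(abs(predict(x) - r) for x, r in zip(xs, rs))


# Hypothetical states matched by one classifier, with reward r = 2x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
rs = [2.0 * x + 1.0 for x in xs]

constant = fit_constant(xs, rs)
line = fit_line(xs, rs)

print("constant prediction error:", mean_abs_error(constant, xs, rs))
print("linear prediction error:  ", mean_abs_error(line, xs, rs))
```

With a constant prediction, this classifier is inaccurate over the whole
interval it matches; with a straight line it can remain accurate while
matching the same states, which is why the richer prediction allows more
general classifiers.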