Would the classifier set optimality criterion that was introduced in Chap. 7
also provide a safeguard against divergence at the model structure level;
that is, would divergent classifiers be detected? In contrast to XCS(F), the cri-
terion presented there does not assume a classifier to be a bad local model
as soon as its model error exceeds a certain threshold. Rather, it deems the
localisation of a classifier inappropriate only if the classifier's model is unable
to capture the apparent pattern hidden in the noisy data. It is therefore not
immediately clear whether the criterion would detect the divergent model as a
pattern that the classifier cannot model, or whether it would attribute it to noise.
In any case, providing stability at the model structure level amounts to repairing
the problem of divergence after it has occurred, and relies on the assumption that
changing the model structure does indeed provide the required stability. This
is not a satisfactory solution, which is why the focus should be on preventing
the problem from occurring in the first place, as discussed in the next section.
Stability on the Parameter Learning Level
Given a fixed model structure $\mathcal{M}$, the aim is to provide parameter
learning that is guaranteed to converge when used with DP methods. Recall that
both value iteration and policy iteration are guaranteed to converge if the
approximation architecture is a non-expansion with respect to the maximum norm
$\|\cdot\|_\infty$. Its being a non-expansion with respect to the weighted norm
$\|\cdot\|_D$, on the other hand, is sufficient for the convergence of the
policy evaluation step of policy iteration, but not of value iteration. In order
to guarantee stability of either method when using LCS, the LCS approximation
architecture needs to provide such a non-expansion.
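To see why the non-expansion property is the relevant one, consider the
standard composition argument (a sketch, assuming the Bellman operator $T$
contracts with factor $0 \le \gamma < 1$ with respect to $\|\cdot\|_\infty$):

$$\|\Pi T V - \Pi T V'\|_\infty \;\le\; \|T V - T V'\|_\infty \;\le\; \gamma \|V - V'\|_\infty,$$

so the combined operator $\Pi T$ of approximate value iteration,
$V_{t+1} = \Pi T V_t$, remains a contraction and converges to a unique fixed
point by the Banach fixed point theorem. The same argument with $\|\cdot\|_D$
and the policy evaluation operator $T^\pi$, which contracts with respect to
that weighted norm when $D$ reflects the stationary state distribution of the
evaluated policy, covers the policy evaluation case.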
Observe that having a single classifier that matches all states is a valid
model structure. For this model structure to provide a non-expansion, the
classifier model itself must form a non-expansion. Therefore, to ensure that the
LCS model provides the non-expansion property for any model structure, every
classifier model needs to form a non-expansion, and any mixture of a set of
localised classifiers that forms the global LCS model needs to form a
non-expansion as well. Formally, if $\|\cdot\|$ denotes the norm in question,
we need

$$\|\Pi V - \Pi V'\| \le \|V - V'\| \qquad (9.33)$$
to hold for any two $V$, $V'$, where $\Pi$ is the approximation operator of a
given LCS model structure. If the model structure is formed by a single
classifier that matches all states,

$$\|\Pi_k V - \Pi_k V'\| \le \|V - V'\| \qquad (9.34)$$

needs to hold for any two $V$, $V'$, where $\Pi_k$ is the approximation
operator of a single classifier. These requirements are independent of the LCS
model type.
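The following minimal numerical sketch (not part of the original text; the
matching regions and the averaging classifier model are illustrative
assumptions) checks (9.33) and (9.34) with respect to $\|\cdot\|_\infty$ for
averaging classifiers, each predicting the mean value over its matched states,
mixed by matching-proportional weights that sum to one at every state:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 8  # number of states in the toy problem

    # Two overlapping matching regions (hypothetical choice)
    matching = [np.arange(n) < 5, np.arange(n) >= 3]

    def classifier_approx(V, matched):
        # Pi_k V for an averaging classifier: its prediction is the mean
        # value over the states it matches (zero elsewhere, where its
        # mixing weight is zero anyway)
        out = np.zeros_like(V)
        out[matched] = V[matched].mean()
        return out

    def mixture_approx(V):
        # Pi V: mix the classifier predictions with state-wise weights
        # that are non-negative and sum to one at every state
        G = np.array(matching, dtype=float)
        G /= G.sum(axis=0)  # normalise over classifiers per state
        return sum(G[k] * classifier_approx(V, matching[k])
                   for k in range(len(matching)))

    for _ in range(1000):
        V, W = rng.normal(size=n), rng.normal(size=n)
        # Eq. (9.34): each single classifier is a non-expansion
        assert (np.abs(classifier_approx(V, matching[0])
                       - classifier_approx(W, matching[0])).max()
                <= np.abs(V - W).max() + 1e-12)
        # Eq. (9.33): so is the mixture of classifiers
        assert (np.abs(mixture_approx(V) - mixture_approx(W)).max()
                <= np.abs(V - W).max() + 1e-12)

The check passes because averaging is a non-expansion with respect to the
maximum norm; a general linear least-squares classifier model would not
necessarily satisfy (9.34) in this norm, which is the kind of question the
following sections address.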
Returning to the LCS model structure with independently trained classifiers,
the next two sections concentrate on its non-expansion property, firstly with
respect to $\|\cdot\|_\infty$, and then with respect to $\|\cdot\|_D$.