does). Rather, the globally matching classifier should be rearranged such that
it does not directly compete with the specific classifier in modelling its part of
the data. The resulting pair of classifiers would then cooperate to model a part
of the data and can be seen as a building block of a potentially good model
structure. Thus, while these building blocks exist, they are not exploited when
using the MCMC algorithm for model structure search.
When using a GA for model structure search, on the other hand, the po-
pulation of individuals can contain several potentially useful building blocks,
and it is the responsibility of the crossover operator to identify and recombine
them. As shown by Syswerda [210], uniform crossover generally yields better re-
sults that one-point and two-point crossover. The crossover operator that is used
aims at uniform crossover for variable-length individuals. Further improvement
in identifying building blocks can be made by using Estimation of Distribution
Algorithms (EDAs) [183], but as there are currently no EDAs that directly apply
to the problem structure at hand [150] this topic requires further investigation.
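To make the crossover idea concrete, the following is a minimal sketch of a uniform-crossover analogue for variable-length individuals: the classifiers of both parents are pooled and each classifier is reassigned to a random offspring, so no positional linkage is assumed. The function name and interface are illustrative, not taken from the text.

```python
import random

def uniform_crossover(parent_a, parent_b, rng=random):
    """Uniform-crossover analogue for variable-length individuals.

    Pools the classifiers of both parents and assigns each one to a
    randomly chosen offspring, re-drawing the assignment if either
    offspring would end up empty. Classifiers are treated as opaque
    objects; matching functions and model parameters are not touched.
    """
    pool = list(parent_a) + list(parent_b)
    while True:
        child_a, child_b = [], []
        for clf in pool:
            (child_a if rng.random() < 0.5 else child_b).append(clf)
        if child_a and child_b:          # both offspring must be non-empty
            return child_a, child_b
```

Because every classifier in the pool survives into exactly one offspring, potentially useful building blocks from either parent can end up combined in a single individual.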
8.3 Empirical Demonstration
To demonstrate the usefulness of the optimality criterion that was introduced
in the last chapter, the previously described algorithms are used to find a good
set of classifiers for a set of simple regression tasks. These tasks are kept simple
in the sense that the number of classifiers expected to be required is low,
such that the O(K³) complexity of ModelProbability does not cause
any computational problems. Additionally, the crudeness of the model structure
search procedures does not allow us to handle problems where the best solution is
given by a complex agglomeration of classifiers. All regression tasks have D X =1
and D Y = 1 such that the results can be visualised easily. The mixing features
are given by φ ( x ) = 1 for all x . Not all functions are standardised, but their
domain is always within [-1, 4] and their range is within [-1, 1]. For all experiments,
classifiers that model straight lines are used, together with uninformative priors
and hyperpriors as given in Table 8.1.
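As a concrete picture of the experimental setup, the sketch below generates a 1-D noisy regression task (D_X = D_Y = 1) and fits a single straight-line classifier to its matched data by weighted least squares. The function name, the matching vector `m`, and the particular target line are illustrative assumptions, not details from the text.

```python
import numpy as np

def fit_line(x, y, m):
    """Weighted least-squares fit of a straight line y = w0 + w1*x.

    m holds the matching degree of each input (1 = matched, 0 = not),
    so only matched points influence the fit. Interface is a sketch,
    not the book's actual classifier-training routine.
    """
    X = np.column_stack([np.ones_like(x), x])   # bias column + input
    W = np.diag(m)
    w, *_ = np.linalg.lstsq(W @ X, W @ y, rcond=None)
    return w                                     # (intercept, slope)

# A simple noisy 1-D task: a straight line plus Gaussian noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 200)
y = 0.5 * x - 0.2 + rng.normal(0.0, 0.1, 200)
w = fit_line(x, y, np.ones_like(x))              # classifier matches all inputs
```

With enough data, the recovered intercept and slope approach the underlying line despite the noise, which is exactly the separation of pattern from noise that the optimality criterion is meant to capture.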
Even though the prime problems that most new LCS are tested against are
Multiplexer problems of various lengths [237], they are a challenge for the model
structure search rather than the optimality criterion and thus are not part of the
provided test set. Rather, a significant amount of noise is added to the data, as
the aim is to provide a criterion that defines the minimal model, and can separate
the underlying patterns from the noise, given that enough data is available.
Firstly, two different representations that are used for the matching functions
are introduced. Then, the four regression tasks, their aim, and the found results
are described, one by one.
8.3.1 Representations
The two representations that are going to be used are matching by radial-basis
functions, and matching by soft intervals. Starting with matching by radial-basis