Database Reference
In-Depth Information
Fig. 15.4 A Pseudo code for GA.
approaches. The initial population is generated by applying a standard
top-down decision tree inducers but attributes are selected in a dipolar
way. Specifically, two instances from different classes are randomly chosen.
Then an attribute which differentiates between the two instances is
selected.
The Fitness function is composed of two terms: the classification
accuracy on the training set and the tree complexity. Specifically, the fitness
function, which must be maximized, has the following form:
Fitness = Acc
α
·
S,
(15.1)
where Acc is the classification quality estimated on the learning set; S is the
size of the tree (number of nodes); and α — is a parameter which indicate
the relative importance of the complexity term. The value of α should be
provided by the user by tuning it to the specific problem that is solved.
The algorithm terminates if the fitness of the best individual in the
population does not improve during the fixed number of generations.
This status indicates, that the algorithm has converged. Additionally, the
maximum number of generations is specified, which allows limiting the
computation time in case of a slow convergence.
Like many other GAs, the GDT-EA also has two operators: MutateNode
(for mutation) and CrossTrees (for crossover). The first operator MutateN-
ode, which is applied with the given probability to every node of the tree,
can modify the test or change the node structure. If a non-leaf node is
Search WWH ::




Custom Search