the error is smaller for CTC than for C4.5. The statistically significant differences
(paired t-test [5], [6]), at the 95% confidence level, are marked in italics. The
differences are statistically significant in 11 databases for C4.5_100 and in 10 databases
for C4.5_union. In the databases where the results for C4.5_100 or C4.5_union are better, the
differences are not statistically significant. The differences with the results of
C4.5_not_resampling are never statistically significant, and the behaviour of CTC is better
on average. We can therefore state that the discriminating capacity of the CTC algorithm is at
least as good as that of C4.5. In this situation, it is worth comparing the
structural stability of the different classifiers: greater structural stability
means that CT trees have better explaining capacity. The data
show that CTs achieve higher structural stability than C4.5_100 (on average 8.46
compared to 3.24) and C4.5_not_resampling (on average 8.46 compared to 5.60).
Looking at the values of Common obtained for C4.5_union, one might say that those trees
achieve higher structural stability than CTC (Common is on average 23.44 compared
to 8.46), but this happens because the complexity of C4.5_union trees is an order of
magnitude larger than that of CTs. In environments where explanation, and
therefore stability, is important, such complex trees are not useful. Moreover, since the
error is smaller for CTC, the principle of parsimony also counts against the
C4.5_union option. More information about this experimentation can be found in [14].
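The per-database comparisons above rely on a paired t-test at the 95% confidence level. As a hedged sketch of how such a comparison works (the helper name and the error values are illustrative, not the paper's), the test statistic on matched per-fold error rates can be computed as:

```python
import math
from statistics import mean, stdev

def paired_t(errors_a, errors_b):
    """Paired t statistic on matched samples, e.g. per-fold error
    rates of two classifiers on the same cross-validation folds."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Illustrative per-fold error rates (percent); not the paper's data.
ctc = [4.1, 5.2, 3.8, 4.9, 5.0]
c45 = [5.0, 5.9, 4.1, 5.6, 5.8]
t = paired_t(ctc, c45)

# The two-sided critical value for df = 4 at the 95% level is about
# 2.776; |t| above it marks the difference as statistically significant.
significant = abs(t) > 2.776
```

Pairing by fold matters: it cancels the fold-to-fold difficulty variation, so the test only measures the per-fold difference between the two classifiers.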
Therefore, we can say that, on average, classification trees induced with the CTC
algorithm have a lower error rate than those induced with C4.5, and they are
structurally steadier. As a consequence, they provide a wider and steadier explanation,
which makes it possible to tackle the excessive sensitivity of classification trees
to resampling methods.
5 Analysis of Convergence
We have observed that the value of Common for CT trees increases with the number
of subsamples used. This means that the CT trees tend to have a larger common
structure when Number_Samples increases. This is a desirable behaviour, but it could
be due to the higher complexity of the trees (this was the case for C4.5_union in the previous
section). In order to take the parsimony principle into account, we have normalised the
Common value with respect to the trees' size (number of internal nodes). We will
call this measure %Common, and it quantifies the fraction of the structure that is
identical across two or more trees.
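A minimal sketch of that normalisation (the function name and signature are illustrative assumptions, not from the paper): divide the shared-node count by the average tree size, so that larger trees do not inflate the score.

```python
def percent_common(common_nodes, tree_sizes):
    """Normalise Common by tree complexity: the number of shared
    internal nodes divided by the average number of internal nodes
    of the compared trees, expressed as a percentage.
    Illustrative sketch, not the paper's exact definition."""
    avg_size = sum(tree_sizes) / len(tree_sizes)
    return 100.0 * common_nodes / avg_size

# Two trees with 10 internal nodes each, sharing 9 of them:
score = percent_common(9, [10, 10])  # 90.0 -> 90% identical structure
```

Under this normalisation, a raw Common of 23.44 over very large trees can yield a smaller %Common than a raw Common of 8.46 over compact trees, which is exactly the parsimony correction motivated above.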
The information in Fig. 2 belongs to one run of the 10-fold cross-validation for the
Breast-W database. The curves represent the values of %Common in each one of the
folds as the Number_Samples parameter varies. Some clues for a better
understanding of the figure: a value of 100% for %Common in a set of trees
means that all the compared trees are identical; a value of 90% means that, on
average, the compared trees have 90% of their structure in common.
Each line in Fig. 2 represents, for the CTC algorithm (left side) and the C4.5 algorithm
(right side), the evolution of %Common in one fold as the number of samples used to build the
trees increases. The number of trees compared in each fold varies with the
Number_Samples parameter. For N_S = 5, 20 trees are compared in each fold and it