• Regarding the number of intervals, the discretizers which divide the numerical attributes into fewer intervals are Heter-Disc, MVD and Distance, whereas the discretizers which require a large number of cut points are HDD, ID3 and Bayesian. The Wilcoxon test confirms that Heter-Disc is the discretizer that obtains the fewest intervals, outperforming the rest.
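The discretizers compared above are too involved for a short snippet, but the notion of cut points is easy to illustrate with the simplest unsupervised schemes, equal-width and equal-frequency binning (which also appear later in this comparison as EqualWidth and EqualFrequency). A minimal sketch, with function names and data of our own choosing:

```python
def equal_width_cuts(values, k):
    """Cut points dividing the range of values into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]

def equal_frequency_cuts(values, k):
    """Cut points so that each interval holds roughly the same number of values."""
    s = sorted(values)
    n = len(s)
    return [s[(i * n) // k] for i in range(1, k)]

vals = [1.0, 2.0, 2.5, 3.0, 10.0, 11.0, 12.0, 50.0]
print(equal_width_cuts(vals, 4))      # cuts span the full range: [13.25, 25.5, 37.75]
print(equal_frequency_cuts(vals, 4))  # cuts follow the data density: [2.5, 10.0, 12.0]
```

The fewer cut points a method produces, the coarser the resulting representation of the attribute, which is what links the interval counts above to the inconsistency analysis that follows.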
• The inconsistency rate follows a similar trend across all discretizers in both training and test data, with the inconsistency obtained on test data always lower than on training data. ID3 is the discretizer that obtains the lowest average inconsistency rate on training and test data, although the Wilcoxon test cannot find significant differences between it and the other two discretizers: FFD and PKID. We can observe a close relationship between the number of intervals produced and the inconsistency rate: the discretizers that compute fewer cut points are usually those with a high inconsistency rate. They sacrifice the consistency of the data in order to simplify the result, although consistency is not usually correlated with accuracy, as we will see below.
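The inconsistency rate can be made concrete. Under one common definition (an assumption on our part, since the formula is not restated here), it counts, for each distinct discretized attribute pattern, the instances that fall outside that pattern's majority class:

```python
from collections import Counter, defaultdict

def inconsistency_rate(patterns, labels):
    """Fraction of instances whose class differs from the majority class
    of the instances sharing the same (discretized) attribute pattern."""
    by_pattern = defaultdict(Counter)
    for p, y in zip(patterns, labels):
        by_pattern[tuple(p)][y] += 1
    inconsistent = sum(sum(c.values()) - max(c.values())
                       for c in by_pattern.values())
    return inconsistent / len(labels)

# Two identical patterns with different classes -> one inconsistent instance.
X = [(0, 1), (0, 1), (1, 0), (1, 1)]
y = ["a", "b", "a", "a"]
print(inconsistency_rate(X, y))  # 1 / 4 = 0.25
```

Coarser discretizations map more distinct raw values onto the same pattern, which is why methods with few cut points tend to score worse on this measure.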
• In decision trees (C4.5 and PUBLIC), a subset of discretizers can be singled out as the best performing ones. Considering average accuracy, FUSINTER, ChiMerge and CAIM stand out from the rest. Considering average kappa, Zeta and MDLP are also added to this subset. The Wilcoxon test confirms this result and adds another discretizer, Distance, which outperforms 16 of the 29 methods. All the methods emphasized are supervised, incremental (except Zeta) and use statistical and information measures as evaluators. The Splitting/Merging and Local/Global properties have no effect on decision trees.
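Average kappa, used above alongside average accuracy, corrects the agreement between predicted and true classes for agreement expected by chance. A minimal sketch of Cohen's kappa on toy labels of our own invention:

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n   # plain accuracy
    true_freq, pred_freq = Counter(y_true), Counter(y_pred)
    # agreement expected by chance from the marginal class frequencies
    expected = sum(true_freq[c] * pred_freq[c] for c in true_freq) / (n * n)
    return (observed - expected) / (1 - expected)

# Accuracy is 0.75, but kappa discounts the chance component -> 0.5.
print(cohen_kappa(["a", "a", "b", "b"], ["a", "a", "b", "a"]))  # 0.5
```

This is why accuracy and kappa can rank discretizers differently, as the results above show: a classifier that leans on the majority class inflates accuracy but not kappa.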
• Considering rule induction (DataSqueezer and Ripper), the best performing discretizers are Distance, Modified Chi2, Chi2, PKID and MODL in average accuracy, and CACC, Ameva, CAIM and FUSINTER in average kappa. In this case, the results are quite irregular: the Wilcoxon test singles out ChiMerge, rather than Distance, as the best performing discretizer for DataSqueezer, and incorporates Zeta into the subset. With Ripper, the Wilcoxon test confirms the results obtained by averaging accuracy and kappa. It is difficult to discern a common set of properties that defines the best performing discretizers, because rule induction methods differ in their operation to a greater extent than decision trees do. However, we can say that, in the subset of best methods, incremental and supervised discretizers with statistical evaluation predominate.
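The pairwise comparisons throughout this analysis rely on the Wilcoxon signed-rank test. As a minimal sketch under standard assumptions (zero differences dropped, tied absolute differences given averaged ranks), the test statistic can be computed for two discretizers' per-dataset accuracies; the values below are invented for illustration, and in practice a library routine such as SciPy's `wilcoxon` would also supply the p-value:

```python
def wilcoxon_signed_rank(a, b):
    """Wilcoxon signed-rank statistic: min(W+, W-) over paired samples."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # drop zero differences
    ranked = sorted(diffs, key=abs)
    ranks = [0.0] * len(ranked)
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and abs(ranked[j]) == abs(ranked[i]):
            j += 1
        avg = (i + 1 + j) / 2          # average of ranks i+1 .. j for tied |d|
        for k in range(i, j):
            ranks[k] = avg
        i = j
    w_plus = sum(r for d, r in zip(ranked, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ranked, ranks) if d < 0)
    return min(w_plus, w_minus)

# Accuracy (%) of two hypothetical discretizers over six datasets.
acc_a = [85, 78, 91, 80, 88, 75]
acc_b = [82, 79, 87, 80, 84, 70]
print(wilcoxon_signed_rank(acc_a, acc_b))  # 1.0
```

A small statistic relative to the critical value for the sample size indicates that one method consistently beats the other across datasets, which is how the "outperforms N of 29 methods" counts above are obtained.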
• Lazy and Bayesian learning can be analyzed together, since the HVDM distance used in KNN is closely related to the computation of Bayesian probabilities assuming attribute independence [114]. With respect to lazy and Bayesian learning, KNN and Naïve Bayes, the subset of remarkable discretizers is formed by PKID, FFD, Modified Chi2, FUSINTER, ChiMerge, CAIM, EqualWidth and Zeta when average accuracy is used; Chi2, Khiops, EqualFrequency and MODL must be added when average kappa is considered. The statistical report by the Wilcoxon test reveals two outstanding methods: PKID for KNN, which outperforms 27 of the 29 methods, and FUSINTER for Naïve Bayes. Here, supervised and unsupervised, direct and incremental, binning and statistical/information evalua-