rule representation. In addition, C4.5 solves the problem of learning from continuous-valued data.
7.7.5 Discretizing continuous attributes
Decision tree learning is mainly designed for attributes that take discrete values. To learn from a continuous variable, it must first be discretized. In some algorithms (such as C4.5), however, a discretized continuous attribute is more likely to be selected than a genuinely discrete one. For a continuous attribute, these algorithms first sort the distinct values occurring in the training examples and then take the midpoint of each pair of adjacent values as a candidate threshold for splitting. Because such a candidate split may be supported by as little as a single example, continuous attributes tend to be selected preferentially.
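The threshold-selection procedure described above can be sketched as follows. This is a minimal illustration, not C4.5's actual implementation; the function names are mine, and information gain is used as the selection criterion:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def candidate_thresholds(values):
    """Midpoints between adjacent distinct sorted values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

def best_threshold(values, labels):
    """Pick the binary split threshold with maximum information gain."""
    base = entropy(labels)
    best, best_gain = None, -1.0
    for t in candidate_thresholds(values):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        gain = base - (len(left) * entropy(left)
                       + len(right) * entropy(right)) / len(values)
        if gain > best_gain:
            best, best_gain = t, gain
    return best, best_gain
```

For example, `best_threshold([1, 2, 3, 10, 11, 12], ['a', 'a', 'a', 'b', 'b', 'b'])` selects the midpoint 6.5 between the two clusters, with an information gain of 1 bit.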
Dougherty employed an information-entropy-based procedure to discretize continuous attributes. Continuous variables are discretized globally, before the decision tree is generated, rather than locally from the examples at each node as in C4.5; because the data local to a node is small, it is easily affected by noise. Based on information entropy, this approach recursively divides a continuous value range into many discrete intervals, using the MDL criterion as its stopping rule. They found that this approach does not reduce classification accuracy when used in C4.5; on the contrary, it sometimes raises the accuracy and decreases the size of the trees C4.5 produces.
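The recursive entropy-based division can be sketched as below. This is a simplified stand-in for the procedure Dougherty used: recursion stops on a minimum interval size and non-positive gain instead of the full MDL acceptance test, and the function names are illustrative:

```python
from collections import Counter
from math import log2

def _entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def discretize(values, labels, min_size=2):
    """Recursively cut a continuous attribute into intervals by entropy."""
    cuts = []
    def split(pairs):
        if len(pairs) < 2 * min_size:
            return
        labs = [y for _, y in pairs]
        base = _entropy(labs)
        best_t, best_gain, best_i = None, 0.0, None
        # candidate cuts: midpoints between adjacent distinct values
        for i in range(1, len(pairs)):
            if pairs[i - 1][0] == pairs[i][0]:
                continue
            left = [y for _, y in pairs[:i]]
            right = [y for _, y in pairs[i:]]
            gain = base - (len(left) * _entropy(left)
                           + len(right) * _entropy(right)) / len(pairs)
            if gain > best_gain:
                best_t = (pairs[i - 1][0] + pairs[i][0]) / 2
                best_gain, best_i = gain, i
        if best_t is None:
            return
        cuts.append(best_t)
        split(pairs[:best_i])    # recurse into each side
        split(pairs[best_i:])
    split(sorted(zip(values, labels)))
    return sorted(cuts)
```

On a toy attribute with three class-pure regions, `discretize([1, 2, 3, 4, 10, 11, 12, 13], ['a', 'a', 'b', 'b', 'c', 'c', 'c', 'c'])` recovers the two cut points 2.5 and 7.0.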
Auer proposed a local method for discretizing continuous variables. His T2 algorithm divides a continuous variable into m discrete intervals rather than into a binary split. Instead of the recursive division described above, it performs a complete search to find the set of intervals that minimizes the error on the training example set.
is minimum. Default value of m is
is the number of partition
classes. Therefore, complexity of T2 is proportional to
C
+1, where
C
6
2
C
f
.
f
is number of
attribute.
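T2's complete search over interval boundaries can be illustrated with a brute-force sketch on a single attribute. This is a toy rendering of the idea, not T2's actual data structures or search order; each interval predicts its majority class, and all placements of the m−1 cut points are tried:

```python
from collections import Counter
from itertools import combinations

def best_intervals(values, labels, m):
    """Exhaustive search for the m-interval partition of one continuous
    attribute that minimizes training error (complete, non-recursive)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    def errors(segment):
        # majority-class error within one interval
        labs = [y for _, y in segment]
        return len(labs) - max(Counter(labs).values())
    best_cuts, best_err = [], float("inf")
    # choose m - 1 cut positions between consecutive sorted examples
    for cut_idx in combinations(range(1, n), m - 1):
        bounds = (0,) + cut_idx + (n,)
        err = sum(errors(pairs[a:b]) for a, b in zip(bounds, bounds[1:]))
        if err < best_err:
            best_err = err
            best_cuts = [(pairs[i - 1][0] + pairs[i][0]) / 2 for i in cut_idx]
    return best_cuts, best_err
```

For instance, `best_intervals([1, 2, 3, 4, 5, 6], ['a', 'a', 'b', 'b', 'a', 'a'], 3)` finds the cuts 2.5 and 4.5 with zero training error. Unlike the greedy recursive scheme, this search cannot get stuck in a locally optimal first cut, which is the point of T2's complete search.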
In C4.5 Release 8 (R8), Quinlan proposed a local, MDL-based approach that penalizes a continuous attribute when it has too many distinct values.
Experimental results show that the C4.5 R8 approach performs better in most situations, but Dougherty's global discretization performs better on small data sets. T2 works well when the data is partitioned into a small number of classes.