Quinlan (1986) uses information gain, which is the expected reduction in entropy
(uncertainty) of the class value resulting from knowing the value of the given
attribute and the outcome of the test. Other attribute selection measures, such as
the Gini index, a measure of the statistical dispersion of the target variable (Breiman
et al. 1984), can be and have been used in classification tree induction. In regression
tree induction, the expected reduction in the variance of the target variable (also a
measure of statistical dispersion, but for continuous targets) can be used.
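To make these measures concrete, here is a minimal Python sketch (the function names entropy, gini, variance, and expected_reduction are illustrative, not from the chapter) that computes each impurity and the expected reduction a candidate test achieves:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy (in bits) of a collection of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index: 1 minus the sum of squared class probabilities."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def variance(values):
    """Variance of a collection of continuous target values."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def expected_reduction(parent, partitions, impurity):
    """Expected reduction in `impurity` when `parent` is split into
    `partitions` by the outcomes of an attribute test."""
    n = len(parent)
    return impurity(parent) - sum(
        len(p) / n * impurity(p) for p in partitions)

# Information gain and Gini reduction for one binary test;
# with variance() as the impurity, the same function scores
# candidate splits in regression tree induction.
parent = ["yes", "yes", "yes", "no", "no"]
left, right = ["yes", "yes", "no"], ["yes", "no"]
print(expected_reduction(parent, [left, right], entropy))
print(expected_reduction(parent, [left, right], gini))
```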
Multi-target trees are constructed with the same recursive partitioning algorithm
as single-target trees. The key difference is in the test selection procedure. For
classification, the heuristic impurity function used for selecting the attribute tests
(that define the internal nodes) is defined as $N \cdot \sum_{t=1}^{T} \mathrm{Var}[y_t]$,
with $N$ the number of examples in the node, $T$ the number of target variables, and
$\mathrm{Var}[y_t] = \mathrm{Entropy}[y_t]$ the entropy of target variable $y_t$ in the
node. For regression, the sum of the variance reductions along each of the targets is
used to select tests.
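A minimal sketch of this heuristic, reusing the entropy and variance functions from the previous listing and assuming each target is either nominal (scored by entropy) or continuous (scored by variance):

```python
def multi_target_impurity(node_targets):
    """Heuristic impurity N * sum_t Var[y_t] of a node, where
    `node_targets` is a list of per-example target tuples and
    Var[y_t] is the entropy of a nominal target or the variance
    of a continuous one."""
    n = len(node_targets)
    total = 0.0
    for column in zip(*node_targets):    # one column per target y_t
        if isinstance(column[0], str):   # nominal target: entropy
            total += entropy(column)
        else:                            # continuous target: variance
            total += variance(column)
    return n * total

# Two targets per example: a discrete class and a continuous value.
node = [("yes", 1.2), ("yes", 1.4), ("no", 3.0)]
print(multi_target_impurity(node))
```

A test is then chosen to maximize the decrease of this impurity over the induced partition, exactly as in the single-target case.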
Multi-target trees are an instantiation of the predictive clustering trees (PCTs)
framework (Blockeel et al. 1998). In this framework, a tree is viewed as a hierarchy
of clusters: a node corresponds to a cluster. PCTs have been used to handle different
types of targets: multiple target variables, both discrete and continuous (Struyf and
Džeroski 2006; Debeljak et al. 2009), time series (Džeroski et al. 2007), and
hierarchies of classes with multiple class labels per example (Vens et al. 2008).
An important mechanism used to improve decision tree performance is tree
pruning. Pruning reduces the size of a decision tree by removing sections of the tree
(sub-trees) that are unreliable and do not contribute to the predictive performance of
the tree. When a sub-tree rooted in a certain node of the tree is pruned, it is removed
from the tree and the node is replaced by a leaf. Pruning has a dual goal: to reduce
the complexity of the final tree and to achieve better predictive accuracy by
reducing over-fitting and removing sections of the tree that may be based on noisy
or erroneous data.
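The mechanics of replacing a sub-tree by a leaf can be sketched as follows (a hypothetical minimal Node class of my own devising, not the chapter's notation; est_error stands for whatever error estimate the pruning method uses, e.g. error on a hold-out set in reduced-error pruning):

```python
from collections import Counter

class Node:
    """Binary classification tree node; a node with no children is a leaf."""
    def __init__(self, test=None, left=None, right=None, labels=()):
        self.test, self.left, self.right = test, left, right
        self.labels = labels   # training labels reaching this node
        # A leaf predicts the majority class of its examples.
        self.prediction = Counter(labels).most_common(1)[0][0] if labels else None

    def is_leaf(self):
        return self.left is None and self.right is None

def prune(node, est_error):
    """Bottom-up post-pruning: prune the children first, then replace the
    sub-tree by a leaf whenever the leaf's estimated error is no worse."""
    if node.is_leaf():
        return
    prune(node.left, est_error)
    prune(node.right, est_error)
    saved = (node.test, node.left, node.right)
    subtree_error = est_error(node)
    node.test = node.left = node.right = None   # collapse to a leaf
    if est_error(node) > subtree_error:         # pruning hurt: undo it
        node.test, node.left, node.right = saved
```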
There are two major approaches to decision tree pruning. Pruning can be
employed during tree construction (pre-pruning) or after the tree has been con-
structed (post-pruning). Typically, pre-pruning prescribes a minimum number of
examples per branch, while post-pruning relies on a confidence level for the
accuracy estimates in the leaves.
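As a concrete illustration (an assumption on my part; the chapter does not discuss this library), scikit-learn's CART-style trees expose both flavors: min_samples_leaf acts as a pre-pruning threshold, while ccp_alpha triggers cost-complexity post-pruning (CART's criterion, rather than C4.5's confidence-based error estimate):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Pre-pruning: stop splitting once a leaf would hold fewer than 10 examples.
pre_pruned = DecisionTreeClassifier(min_samples_leaf=10).fit(X, y)

# Post-pruning: grow the full tree, then cut it back by cost-complexity.
post_pruned = DecisionTreeClassifier(ccp_alpha=0.02).fit(X, y)

print(pre_pruned.get_n_leaves(), post_pruned.get_n_leaves())
```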
14.3.3 Systems for Building Decision Trees
The CART (Classification And Regression Trees) system (Breiman et al. 1984)
was the first widely known and used system for learning decision trees. It has
been surpassed in popularity only by the C4.5 system for learning classification
trees (Quinlan 1993), succeeded by C5.0 (RuleQuest 2009). Nowadays, the most