highest average distance to the other rules are considered to be most
interesting [Gago and Bentos, 1998].
4.9 Overfitting and Underfitting
The concept of overfitting is very important in data mining. It refers to
the situation in which the induction algorithm generates a classifier that
fits the training data perfectly but loses the ability to generalize to
instances not presented during training. In other words, instead of learning,
the classifier merely memorizes the training instances. Overfitting is generally
recognized to be a violation of the principle of Occam's razor presented in
Section 4.4.
In decision trees, overfitting usually occurs when the tree has too many
nodes relative to the amount of training data available. As the number of
nodes increases, the training error usually decreases, but at some point
the generalization error begins to worsen.
Figure 4.8 illustrates the overfitting process. The figure presents the
training error and the generalization error of a decision tree as a function
of the number of nodes for a fixed training set. The training error continues
to decline as the tree becomes bigger. The generalization error, on the other
hand, declines at first and then, at some point, starts to increase due to
overfitting. The optimal point for the generalization error is obtained for
a tree with 130 nodes. In bigger trees, the classifier is overfitted. In
smaller trees, the classifier is underfitted.
Fig. 4.8 Overfitting in decision trees: training error and generalization error as a function of tree size (number of nodes).
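The behavior shown in Fig. 4.8 can be reproduced empirically by growing decision trees of increasing size on a fixed training set and comparing the error on the training data with the error on held-out data. The following is a minimal sketch, not taken from the text, assuming the scikit-learn library is available; the synthetic data, the use of max_leaf_nodes as a stand-in for tree size, and the specific size grid are illustrative choices, not part of the original.

```python
# Sketch of the Fig. 4.8 experiment: train error keeps falling as the tree
# grows, while held-out (generalization) error follows a U-shaped curve.
# Assumes scikit-learn; the data set and size grid are illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for a fixed training set.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

for max_leaves in [5, 10, 25, 50, 100, 200, 400]:
    # max_leaf_nodes bounds the tree size, playing the role of the
    # "number of nodes" axis in Fig. 4.8.
    tree = DecisionTreeClassifier(max_leaf_nodes=max_leaves, random_state=0)
    tree.fit(X_train, y_train)
    train_err = 1.0 - tree.score(X_train, y_train)
    test_err = 1.0 - tree.score(X_test, y_test)  # proxy for generalization error
    print(f"{max_leaves:4d} leaves: train error {train_err:.3f}, "
          f"test error {test_err:.3f}")
```

With data of this kind, the printed training error typically decreases monotonically with tree size, while the held-out error reaches a minimum at some intermediate size and then rises, mirroring the overfitting and underfitting regions discussed above.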