[Fig. 4. Effects on split variable. (a) Before edition of the training set: initial distribution (6,6); split based on X: (1,4), (4,1), (1,1); split based on Y: (2,4), (4,2). (b) After edition of the training set: new distribution (8,8); split based on X: (3,4), (4,3), (1,1); split based on Y: (2,6), (6,2).]
[Fig. 5. Effects on pruning. (a) Before training set editing; (b) after training set editing. Legend: instance misclassified according to k-NN; new decision border; duplicated instance.]
belonging to A, #instances belonging to B). The left side shows the original training set, along with the partitions induced by the variables X and Y. The information gain if X is chosen is (1 − 0.7683) = 0.2317, and if Y is chosen instead it is (1 − 0.9183) = 0.0817. So, X would be chosen as the variable to split on. After the training set edition, as shown in the right side of the figure, four instances are duplicated, two of them belonging to class A and the remaining two to class B. Now, the information gain if X is chosen is (1 − 0.9871) = 0.0129, and if Y is chosen instead it is (1 − 0.8113) = 0.1887. Variable Y would be chosen, leading to a different tree.
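The gains above can be verified directly. As a minimal sketch (the function names below are illustrative, not from the original), the weighted-entropy computation over the class distributions shown in Fig. 4 reproduces each figure:

```python
import math

def entropy(a, b):
    """Binary entropy (in bits) of a node holding a instances of class A and b of class B."""
    total = a + b
    h = 0.0
    for count in (a, b):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

def info_gain(parent, partitions):
    """Information gain of splitting `parent` (a, b) into the given child partitions."""
    n = sum(parent)
    weighted = sum((a + b) / n * entropy(a, b) for (a, b) in partitions)
    return entropy(*parent) - weighted

# Before editing: distribution (6,6), so parent entropy is 1.
gain_x_before = info_gain((6, 6), [(1, 4), (4, 1), (1, 1)])  # ≈ 0.2317
gain_y_before = info_gain((6, 6), [(2, 4), (4, 2)])          # ≈ 0.0817

# After editing: distribution (8,8), parent entropy still 1.
gain_x_after = info_gain((8, 8), [(3, 4), (4, 3), (1, 1)])   # ≈ 0.0129
gain_y_after = info_gain((8, 8), [(2, 6), (6, 2)])           # ≈ 0.1887
```

Running this confirms that editing the training set flips the preferred split from X to Y.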
4.2 Change in the Pruning Decision
Figure 5 shows an example where a change in the pruning decision can occur. In the left subfigure, before the edition of the training set with duplication of the cases misclassified by k-NN, the density of examples belonging