Table 4.13. Results of testing the Instance Analysis algorithm on the Vote database

attributes       instance 1    instance 2    instance 3
physician-fee    ?             ?             ?
el-salvador      y             ?             y
education        y             n             n
crime            y             n             n
near=8           (16%, 83%)    (91%, 08%)    (92%, 07%)
near=10          (29%, 70%)    (84%, 15%)    (75%, 24%)
near=12          (38%, 61%)    (70%, 29%)    (57%, 42%)
PAT              (11%, 89%)    (99%, 01%)    (85%, 15%)
C4.5             (53%, 47%)    (53%, 47%)    (53%, 47%)
The Vote database has 16 attributes, so the constant near may be 8, 9, 10, 11, or 12. Table 4.13 contains the results of testing this algorithm only when near is 8, 10, and 12. It also contains the results of testing the PAT approach and C4.5 on the same examples. Comparing the results of PAT and C4.5 with the statistical results given by the Instance Analysis algorithm for the same examples, we observe that PAT's results in Table 4.13 are closest to the statistical results when near is 8; moreover, they are better than the C4.5 results when near is 8, 10, or 12.
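The excerpt does not define the Instance Analysis algorithm itself, but the constant near reads naturally as a count of neighbouring training instances. Under that assumption only, a minimal sketch of how a near-based statistical class distribution could be produced follows; the function name, the overlap measure, and the data layout are all illustrative, not the book's method.

    from collections import Counter

    def near_distribution(training, instance, near):
        """Hypothetical reading of the Instance Analysis idea: take the
        `near` training records that agree most with `instance` on its
        known attribute values, and report their class frequencies."""
        def overlap(rec):
            # count the known attribute values that the record matches
            return sum(1 for a, v in instance.items()
                       if v != "?" and rec.get(a) == v)
        nearest = sorted(training, key=overlap, reverse=True)[:near]
        counts = Counter(rec["class"] for rec in nearest)
        return {cls: n / near for cls, n in counts.items()}

Such a procedure would count Democrats and Republicans among the near most similar voting records, which is consistent with the distributions in Table 4.13 becoming sharper as near shrinks.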
We note that the only attribute used to construct the decision tree with the C4.5 method is physician-fee-freeze, which has the greatest influence on the decision. However, if this attribute is unknown, as in Table 4.13, C4.5 calculates its frequency over all the training data without taking into account the other attributes that depend on it. Consequently, in the test data each object that has a missing value for physician-fee-freeze is classified Democrat with probability 0.53 and Republican with probability 0.47 (Table 4.13). For the threshold 0.4, our probabilistic decision tree corresponding to the training data is constructed using the attributes physician-fee-freeze, el-salvador-aid, and education-spending. The PAT for physician-fee-freeze is constructed using el-salvador-aid, education-spending, and crime. Consequently, when physician-fee-freeze is unknown, we calculate its probability according to the attributes it depends on, and so on. For example, in Table 4.13, we notice that in the PAT approach the probability distribution of each object depends on the values of its other attributes, whereas with C4.5 this distribution depends only on the frequency of physician-fee-freeze.
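The difference between the two fallback strategies can be sketched in a few lines of Python. Everything here is a toy illustration, not the book's implementation: the four records are hypothetical, and c45_style_distribution and pat_style_distribution merely contrast a global class-frequency fallback with an estimate conditioned on the dependent attributes.

    from collections import Counter

    # Hypothetical toy records for the Vote domain; names follow Table 4.13.
    training = [
        {"physician-fee-freeze": "y", "el-salvador-aid": "y",
         "education-spending": "n", "crime": "y", "class": "republican"},
        {"physician-fee-freeze": "n", "el-salvador-aid": "n",
         "education-spending": "y", "crime": "n", "class": "democrat"},
        {"physician-fee-freeze": "n", "el-salvador-aid": "y",
         "education-spending": "y", "crime": "y", "class": "democrat"},
        {"physician-fee-freeze": "y", "el-salvador-aid": "y",
         "education-spending": "n", "crime": "n", "class": "republican"},
    ]

    def c45_style_distribution(data):
        # C4.5 fallback: class frequency over all data, ignoring the
        # remaining attributes of the object being classified
        counts = Counter(rec["class"] for rec in data)
        total = sum(counts.values())
        return {cls: n / total for cls, n in counts.items()}

    def pat_style_distribution(data, instance, dependents):
        # PAT-style idea: condition on the attributes that the missing
        # attribute depends on; fall back to frequency if nothing matches
        matching = [rec for rec in data
                    if all(rec[a] == instance[a]
                           for a in dependents if instance[a] != "?")]
        return c45_style_distribution(matching or data)

    # instance 1 of Table 4.13: physician-fee-freeze is unknown
    inst = {"physician-fee-freeze": "?", "el-salvador-aid": "y",
            "education-spending": "y", "crime": "y"}
    print(c45_style_distribution(training))
    print(pat_style_distribution(training, inst,
          ["el-salvador-aid", "education-spending", "crime"]))

On these toy records the C4.5-style call returns the same split for every object with the value missing, while the PAT-style call changes with the object's other attribute values, mirroring the constant (53%, 47%) row versus the varying PAT row in Table 4.13.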
4.6 The Complexity of Constructing a Decision Tree
The complexity of a decision tree depends on the size of the training data and on the number of attributes in the data. For example, if the training data contain n objects and m attributes not counting the class, the computational complexity of the decision tree constructed from these training data is [19]:
O(n × (m + 1) × log(n))    (4.6)
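As a quick sanity check of the bound (4.6), it can be evaluated for concrete sizes. The snippet below is illustrative only: big-O notation hides constant factors and the logarithm base, and n = 435 is the standard size of the Vote database (its 16 attributes are stated above).

    import math

    def tree_construction_cost(n: int, m: int) -> float:
        # growth term of Eq. (4.6): n * (m + 1) * log(n); the logarithm
        # base is chosen arbitrarily, since big-O absorbs it
        return n * (m + 1) * math.log2(n)

    print(tree_construction_cost(435, 16))  # about 6.5e4 elementary steps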