as a measure of the strength of the relations between the attributes and the class².
There is a fixed order for constructing the attribute trees, guided by the mutual information between each attribute and the class: the method ranks the attributes from low to high mutual information and builds the attribute trees in that order. These trees are used to determine the unknown values of each attribute. The first attribute tree constructed is a one-node tree holding the most frequent value of the attribute. The tree for an attribute A_i is built from a training subset that contains the instances whose value for A_i is known, together with the attributes whose missing values have already been filled in; consequently, the attributes A_k for which MI(A_i, C) < MI(A_k, C) are excluded [16]. When computing MI(A_i, C), instances with a missing value for A_i are ignored [18]. This method is not general enough to be applicable to every domain [18]: domains with strong relations between the attributes appear to be the most suitable for the OAT method. The idea of starting with the attribute that is least dependent on the class [16, 17, 18] is interesting, because that attribute has the least influence on the class.
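The ordering step described above can be sketched as follows. This is a minimal sketch, not the implementation of [16]: the row layout, the `None` missing-value sentinel, and all function names are assumptions.

```python
# Sketch of the OAT ordering step: rank the attributes by MI(A_i, C),
# from low to high, skipping instances whose value for A_i is missing.
# The row layout and the `None` missing-value sentinel are assumptions.
import math
from collections import Counter

def mutual_information(xs, ys):
    """MI(X, Y) = sum_{x,y} P(x,y) log2(P(x,y) / (P(x) P(y)))."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((nxy / n) * math.log2((nxy / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), nxy in pxy.items())

def oat_order(instances, attribute_indices, class_index):
    """Return attribute indices ordered from low to high MI with the class."""
    scored = []
    for a in attribute_indices:
        pairs = [(row[a], row[class_index]) for row in instances
                 if row[a] is not None]  # ignore missing values of A_i
        xs, ys = zip(*pairs)
        scored.append((mutual_information(list(xs), list(ys)), a))
    return [a for _, a in sorted(scored)]
```

On a toy table where the first attribute fully determines the class and the second is independent of it, `oat_order` places the second attribute first, so its tree would be built first.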
4.1.3
C4.5's Method
Quinlan's method [5] assigns probability distributions to the nodes of the decision tree when learning from training instances. At each node, the probability distribution over the values of the attribute tested there is estimated from the relative frequencies of those values among the training instances collected at that node. The result of classification is then a class distribution rather than a single class. This approach works well when most of the attributes are independent, because it relies only on the prior distribution of the values of the attribute tested at each node [5, 23].
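The node-level bookkeeping can be sketched as follows. This is a hedged illustration of the idea, not Quinlan's actual code: the `Node` structure and all names are assumptions, and an instance with an unknown test value is split across branches weighted by the relative frequencies observed at the node.

```python
# Sketch of C4.5-style probabilistic classification: an instance whose
# value for the tested attribute is unknown is passed down every branch,
# weighted by the relative frequency of each branch's value at the node.
# The Node structure and names are illustrative assumptions.
from collections import Counter

class Node:
    def __init__(self, attr=None, children=None, class_counts=None):
        self.attr = attr                 # attribute tested here (None = leaf)
        self.children = children or {}   # attribute value -> child Node
        self.class_counts = class_counts or Counter()  # training class freqs

def classify(node, instance):
    """Return a class -> probability distribution for `instance`."""
    if node.attr is None:  # leaf: relative class frequencies
        total = sum(node.class_counts.values())
        return {c: n / total for c, n in node.class_counts.items()}
    value = instance.get(node.attr)
    if value in node.children:
        return classify(node.children[value], instance)
    # Unknown attribute value: weight each branch by the relative
    # frequency of its value among the training instances at this node.
    total = sum(sum(ch.class_counts.values()) for ch in node.children.values())
    dist = Counter()
    for child in node.children.values():
        weight = sum(child.class_counts.values()) / total
        for c, p in classify(child, instance).items():
            dist[c] += weight * p
    return dict(dist)
```

For instance, at a node testing an attribute whose branches collected 3 and 1 training instances, an instance lacking that attribute is split 0.75/0.25 between the branches, and the returned class distribution mixes the leaf distributions with those weights.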
4.1.4
Conclusion
We observe that the methods above have some drawbacks. For example, the methods of [5, 13, 22] determine the missing attribute values only once for each object with an unknown attribute. The Dynamic path generation method and the lazy
² Mutual Information (MI) between two categorical random variables X and Y is the average reduction in uncertainty about X that results from learning the value of Y:

MI(X, Y) = −∑_{x ∈ D_x} P(x) log₂ P(x) + ∑_{y ∈ D_y} P(y) ∑_{x ∈ D_x} P(x|y) log₂ P(x|y)
D_x and D_y are the domains of the categorical random variables X and Y. P(x) and P(y) are the probabilities of occurrence of x ∈ D_x and y ∈ D_y, respectively. P(x|y) is the conditional probability of X having the value x once Y is known to have the value y.
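As a quick sanity check, the footnote's formula can be evaluated numerically and compared with the equivalent joint form MI(X, Y) = ∑ P(x, y) log₂(P(x, y) / (P(x) P(y))). The joint distribution below is made up purely for illustration.

```python
# Numeric check of the footnote's formula, MI(X, Y) = H(X) - H(X|Y),
# on a small made-up joint distribution (the numbers are illustrative).
import math

pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}  # P(x, y)
px = {x: sum(p for (x2, _), p in pxy.items() if x2 == x) for x in (0, 1)}
py = {y: sum(p for (_, y2), p in pxy.items() if y2 == y) for y in (0, 1)}

# First term: -sum_x P(x) log2 P(x)  (i.e. H(X))
h_x = -sum(p * math.log2(p) for p in px.values())
# Second term: sum_y P(y) sum_x P(x|y) log2 P(x|y)  (i.e. -H(X|Y))
second = sum(py[y] * sum((pxy[(x, y)] / py[y]) * math.log2(pxy[(x, y)] / py[y])
                         for x in (0, 1))
             for y in (0, 1))
mi = h_x + second  # about 0.278 bits for this distribution
```

The two forms agree because ∑_y P(y) ∑_x P(x|y) log₂ P(x|y) is exactly −H(X|Y), so the sum is H(X) − H(X|Y).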