following the branches according to the attribute values of the instance. When we
encounter a missing value for a test attribute (test node), we must trace all the
paths corresponding to the values of this attribute. In this case, we reach several
leaves in the tree rather than the single leaf of classical classification. For this
purpose, it is necessary to calculate the class probability at each of these
leaves.
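The traversal described above can be sketched as follows. This is only an illustration under assumed conventions: the nested-dict tree encoding, the attribute names `B1` and `B2`, their values, and the function `reachable_leaves` are all hypothetical and not taken from the original work.

```python
# Hypothetical tree encoding (an assumption for this sketch): an internal
# node names the tested attribute and has one child per attribute value;
# a leaf only carries a label.
tree = {
    "attr": "B1",
    "children": {
        "x": {"leaf": "F1"},
        "y": {
            "attr": "B2",
            "children": {"u": {"leaf": "F2"}, "v": {"leaf": "F3"}},
        },
    },
}


def reachable_leaves(node, instance, path=()):
    """Return (leaf, path) pairs reached by an instance.

    An attribute that is absent or None is treated as missing, and the
    walk then follows every branch of the corresponding test node.
    """
    if "leaf" in node:
        return [(node["leaf"], path)]
    value = instance.get(node["attr"])
    if value is None:
        branches = list(node["children"].items())      # trace all paths
    else:
        branches = [(value, node["children"][value])]  # single known branch
    found = []
    for branch_value, child in branches:
        step = (node["attr"], branch_value)
        found.extend(reachable_leaves(child, instance, path + (step,)))
    return found


complete = reachable_leaves(tree, {"B1": "y", "B2": "u"})   # one leaf
missing = reachable_leaves(tree, {"B1": "y", "B2": None})   # several leaves
```

With a complete instance the walk ends at a single leaf; with `B2` missing it ends at every leaf below the `B2` test node, and each returned path records the branches that lead there.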
Let us assume that the class has two values, A and D, and that for a path from the
root of the tree to a leaf F, we go through the branches $B_1, B_2, \ldots, B_n$:
$$P(\text{class } A \text{ at leaf } F) = P(A \mid \text{path from the root to } F) = P(A \mid B_1, B_2, \ldots, B_n)$$
$$P(\text{class } D \text{ at leaf } F) = P(D \mid \text{path from the root to } F) = P(D \mid B_1, B_2, \ldots, B_n)$$
$$P(A \text{ in the tree}) = \sum_{i} P(A \mid F_i)\, P(F_i)$$
$$P(D \text{ in the tree}) = \sum_{i} P(D \mid F_i)\, P(F_i)$$
where $i = 1, \ldots, m$ ($m$ is the number of leaves in the tree).
The probability $P(A \mid F_i)$ is the conditional probability of class A at this leaf;
the probability $P(F_i)$ is the joint probability of the attributes on the path that
runs from the root to the leaf $F_i$.
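As a small numerical sketch of this sum over leaves, the snippet below weights each leaf's class probability by the probability of its path. The three leaves, their probabilities, and the function name `class_probability_in_tree` are made up for illustration and do not come from the original work.

```python
# Illustrative leaves only (assumed values): each entry holds the class
# distribution at the leaf, P(class | F_i), and the path probability P(F_i).
leaves = [
    {"name": "F1", "class_probs": {"A": 0.9, "D": 0.1}, "path_prob": 0.5},
    {"name": "F2", "class_probs": {"A": 0.2, "D": 0.8}, "path_prob": 0.3},
    {"name": "F3", "class_probs": {"A": 0.6, "D": 0.4}, "path_prob": 0.2},
]


def class_probability_in_tree(leaves, label):
    """Sum P(label | F_i) * P(F_i) over the given leaves F_i."""
    return sum(leaf["class_probs"][label] * leaf["path_prob"] for leaf in leaves)


p_a = class_probability_in_tree(leaves, "A")  # 0.9*0.5 + 0.2*0.3 + 0.6*0.2
p_d = class_probability_in_tree(leaves, "D")  # 0.1*0.5 + 0.8*0.3 + 0.4*0.2
```

With these assumed numbers, the probability of A is about 0.63 and that of D about 0.37. The two values sum to 1 here because the path probabilities of the listed leaves sum to 1.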
To simplify, let us consider that the path from the root of the tree to $F_i$
goes through only the branches $B_1$ and $B_2$:
$$P(F_i) = P(B_1, B_2) = P(B_1)\, P(B_2 \mid B_1)$$
where $B_1$ is less dependent on the class than $B_2$ (see footnote 4).
4.3.1 Calculating the Joint Probability $P(B_1, B_2)$ Using Our Approach
To calculate this joint probability, we distinguish the following cases:
- $B_1$ and $B_2$ are independent: $P(B_2 \mid B_1) = P(B_2)$ and $P(B_1, B_2) = P(B_1)\, P(B_2)$.
  Consequently, the PAT of $B_1$ is constructed without $B_2$ and the PAT of $B_2$
  is constructed without $B_1$. We calculate the probability of the attribute $B_1$
  from its PAT. The probability of $B_2$ is also calculated from its PAT.
- $B_1$ and $B_2$ are dependent and the POAT of $B_1$ is constructed without $B_2$
  because $B_1$ is less dependent on the class than $B_2$: $P(B_1 \mid B_2) = P(B_1)$. The
  probability of $B_1$ is calculated from its POAT. Note that the PAT of $B_2$ is
  constructed using $B_1$. Therefore, we calculate the conditional probability of
  $B_2$ given $B_1$, $P(B_2 \mid B_1)$, from the PAT of $B_2$.
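The two cases can be illustrated with a minimal sketch in which plain lookup tables stand in for the distributions that the PAT and POAT would supply. The tables, their values, and the function `joint_probability` are assumptions made for this example; building the PAT or POAT itself is not shown.

```python
def joint_probability(b1, b2, p_b1, p_b2, dependent):
    """Return P(B1=b1, B2=b2) as P(B1) * P(B2 | B1).

    p_b1: dict value -> P(B1=value); a stand-in for what the PAT
          (independent case) or POAT (dependent case) of B1 would give.
    p_b2: if dependent, dict b1 -> {b2: P(B2=b2 | B1=b1)}, a stand-in for
          a PAT of B2 built using B1; otherwise dict b2 -> P(B2=b2),
          a stand-in for a PAT of B2 built without B1.
    """
    if dependent:
        return p_b1[b1] * p_b2[b1][b2]   # P(B1) * P(B2 | B1)
    return p_b1[b1] * p_b2[b2]           # P(B1) * P(B2)


# Assumed distributions, for illustration only.
p_b1 = {"x": 0.4, "y": 0.6}
p_b2_alone = {"u": 0.7, "v": 0.3}
p_b2_given_b1 = {"x": {"u": 0.5, "v": 0.5}, "y": {"u": 0.9, "v": 0.1}}

independent_case = joint_probability("y", "u", p_b1, p_b2_alone, dependent=False)
dependent_case = joint_probability("y", "u", p_b1, p_b2_given_b1, dependent=True)
```

In the independent case the result is the product of the two marginals (0.6 times 0.7 with these numbers); in the dependent case the second factor is read from the row of the conditional table selected by the value of $B_1$ (0.6 times 0.9).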
4 In our work, when two attributes are dependent and unknown at the same time
(the cycle problem), we deal first with the attribute which is less dependent on the class
by using its POAT. Then, for the other attribute, we use its PAT.
 