following the branches according to the attribute values of the instance. When we
encounter a missing value for a test-attribute (test-node), we must trace all the
paths corresponding to the values of this attribute. In this case, we reach several
leaves in the tree, and not only one leaf as in classical classification. For this
purpose, it is necessary to calculate the class probability on each one of these
leaves.
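The traversal described above can be sketched as follows. This is only an illustrative sketch: the `Node` and `Leaf` classes, the attribute names, the branch values and the leaf probabilities are all hypothetical, and a missing value is assumed to be encoded as `None`.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    name: str
    class_probs: Dict[str, float]  # P(class | leaf), invented numbers


@dataclass
class Node:
    attribute: str  # the test-attribute of this node
    children: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def reachable_leaves(tree, instance, path=()):
    """Return every (leaf, path) pair the instance can reach.

    A known value follows a single branch; a missing value (None)
    follows every branch of the test-node.
    """
    if isinstance(tree, Leaf):
        return [(tree, list(path))]
    value = instance.get(tree.attribute)
    if value is None:
        branches = list(tree.children.items())
    else:
        branches = [(value, tree.children[value])]
    results = []
    for branch_value, child in branches:
        step = (tree.attribute, branch_value)
        results.extend(reachable_leaves(child, instance, path + (step,)))
    return results


# Hypothetical tree: B1 at the root, B2 below the branch B1 = "x".
tree = Node("B1", {
    "x": Node("B2", {
        "p": Leaf("F1", {"A": 0.8, "D": 0.2}),
        "q": Leaf("F2", {"A": 0.5, "D": 0.5}),
    }),
    "y": Leaf("F3", {"A": 0.1, "D": 0.9}),
})

# B2 is missing, so both leaves under B1 = "x" are reached.
for leaf, path in reachable_leaves(tree, {"B1": "x", "B2": None}):
    print(leaf.name, path)
```

With every attribute known, the function returns a single leaf, which matches the classical classification case.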
Let us assume that the class has two values, $A$ and $D$, and that a path from the root of the tree to a leaf $F$ goes through the branches $B_1, B_2, \ldots, B_n$.
$P(\text{class } A \text{ at leaf } F) = P(A \mid \text{path from the root to } F) = P(A \mid B_1, B_2, \ldots, B_n)$
$P(\text{class } D \text{ at leaf } F) = P(D \mid \text{path from the root to } F) = P(D \mid B_1, B_2, \ldots, B_n)$
$P(A \text{ in the tree}) = \sum_i P(A \mid F_i) \cdot P(F_i)$
$P(D \text{ in the tree}) = \sum_i P(D \mid F_i) \cdot P(F_i)$
where i = 1,..,m (m is the number of leaves in the tree).
The probability $P(A \mid F_i)$ is the conditional probability of class $A$ at this leaf; the probability $P(F_i)$ is the joint probability of the attributes on the path from the root to the leaf $F_i$.
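As a small numerical illustration of the sum over leaves, the sketch below combines per-leaf class probabilities with path probabilities. All numbers are invented for the example, and the function name `class_probability` is hypothetical.

```python
def class_probability(leaves, cls):
    """Return the sum over leaves of P(cls | F_i) * P(F_i).

    `leaves` holds one (class_probs, path_prob) pair per leaf F_i.
    """
    return sum(class_probs[cls] * path_prob for class_probs, path_prob in leaves)


# Invented values: (P(class | F_i), P(F_i)) for three leaves.
leaves = [
    ({"A": 0.8, "D": 0.2}, 0.5),
    ({"A": 0.5, "D": 0.5}, 0.3),
    ({"A": 0.1, "D": 0.9}, 0.2),
]

p_a = class_probability(leaves, "A")  # 0.40 + 0.15 + 0.02, about 0.57
p_d = class_probability(leaves, "D")  # 0.10 + 0.15 + 0.18, about 0.43
print(round(p_a, 2), round(p_d, 2))
```

Because the path probabilities in this example sum to 1 and each leaf's class probabilities sum to 1, the two results also sum to 1.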
To simplify, let us consider that the path from the root of the tree to $F_i$ goes through only the branches $B_1$ and $B_2$:
$P(F_i) = P(B_1, B_2) = P(B_1) \cdot P(B_2 \mid B_1)$;
$B_1$ is less dependent on the class than $B_2$ (footnote 4).
4.3.1 Calculating the Joint Probability $P(B_1, B_2)$ Using Our Approach
To calculate this joint probability, we distinguish the following cases:
• $B_1$ and $B_2$ are independent: $P(B_2 \mid B_1) = P(B_2)$ and $P(B_1, B_2) = P(B_1) \cdot P(B_2)$. Consequently, the PAT of $B_1$ is constructed without $B_2$ and the PAT of $B_2$ is constructed without $B_1$. We calculate the probability of the attribute $B_1$ from its PAT. The probability of $B_2$ is also calculated from its PAT.
• $B_1$ and $B_2$ are dependent, and the POAT of $B_1$ is constructed without $B_2$ because $B_1$ is less dependent on the class than $B_2$: $P(B_1 \mid B_2) = P(B_1)$. The probability of $B_1$ is calculated from its POAT. Note that the PAT of $B_2$ is constructed using $B_1$. Therefore, we calculate the conditional probability of $B_2$ given $B_1$, $P(B_2 \mid B_1)$, from the PAT of $B_2$.
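The two cases can be compared with a short sketch. The probabilities below are invented placeholders for the values that would be read from the PAT or POAT of each attribute; how those trees are built and queried is not shown here, and the function names are hypothetical.

```python
def joint_independent(p_b1, p_b2):
    """Independent case: P(B1, B2) = P(B1) * P(B2)."""
    return p_b1 * p_b2


def joint_dependent(p_b1, p_b2_given_b1):
    """Dependent case: P(B1, B2) = P(B1) * P(B2 | B1)."""
    return p_b1 * p_b2_given_b1


# Placeholder values (assumptions, not taken from the source).
p_b1 = 0.6            # P(B1): from the PAT of B1, or from its POAT if dependent
p_b2 = 0.5            # P(B2): from a PAT of B2 constructed without B1
p_b2_given_b1 = 0.75  # P(B2 | B1): from a PAT of B2 constructed using B1

print(joint_independent(p_b1, p_b2))          # about 0.30
print(joint_dependent(p_b1, p_b2_given_b1))   # about 0.45
```

Both functions apply the same chain rule; the only difference is which tree supplies the second factor.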
4. In our work, when two attributes are dependent and unknown at the same time (the cycle problem), we deal first with the attribute which is less dependent on the class by using its POAT. Then, for the other attribute, we use its PAT.