In the POAT approach, we construct a POAT for each attribute in the training data. Therefore, for a training set with m attributes (excluding the class attribute) and n instances, we construct m decision trees. The first tree is a one-leaf tree constructed using a single attribute; the second tree uses two attributes, and so on; the last tree is constructed using all m attributes¹¹. Since we obtain m decision trees with an increasing number of attributes, the total complexity of constructing these trees is:
O(1·n log(n) + 2·n log(n) + ... + m·n log(n)) = O((∑_{i=1}^{m} i)·n log(n)) = O((m(m+1)/2)·n log(n))
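As a quick numeric sanity check of this arithmetic series, the following sketch compares the summed per-tree costs against the closed form (the values of m and n are illustrative only, not from the text):

```python
import math

def poat_construction_cost(m, n):
    """Sum the costs of building m trees over 1, 2, ..., m attributes,
    where a tree over i attributes costs i * n * log(n)."""
    return sum(i * n * math.log2(n) for i in range(1, m + 1))

def closed_form(m, n):
    """The closed form O((m(m+1)/2) * n log n) from the text."""
    return (m * (m + 1) / 2) * n * math.log2(n)

# Illustrative values: 10 attributes, 1024 instances
m, n = 10, 1024
assert math.isclose(poat_construction_cost(m, n), closed_form(m, n))
```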
For the same training set, we construct another m decision trees using the PAT
approach. Each tree is constructed for an attribute by using all the attributes
dependent on it. In the worst case, all the attributes are mutually dependent.
In this case, each decision tree is constructed using m attributes. Therefore, the
complexity of these m trees is:
O(m·n log(n) + m·n log(n) + ... + m·n log(n)) = O(m·m·n log(n)) = O(m²·n log(n)).
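The same kind of check applies to the worst case of the PAT approach, where each of the m trees is built over all m attributes (again a sketch with illustrative values):

```python
import math

def pat_construction_cost(m, n):
    """Worst case for PAT: each of the m trees is built over all m
    attributes, so each tree costs m * n * log(n)."""
    return sum(m * n * math.log2(n) for _ in range(m))

# Illustrative values: 10 attributes, 1024 instances
m, n = 10, 1024
assert pat_construction_cost(m, n) == m**2 * n * math.log2(n)  # O(m^2 n log n)
```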
4.6.1 The Complexity of Classifying a New Instance
Once a decision tree has been built, classifying a new instance is extremely fast,
with a worst-case complexity of O(h), where h is the maximum depth of the tree
[6]. This is true when we classify a new instance without missing values, because
we trace only one path in the decision tree from its root-node to a leaf-node
according to the outcome of each attribute test along this path. Let L be the number of leaves in the tree; since each internal node has at most v children, the height of the tree is greater than or equal to log_v(L) [10].
The complexity of classifying a new case is therefore O(log_v(L)). This complexity changes when the new instance has missing values for some attributes, because we then trace several paths in the tree instead of only one. In the worst case, we trace all the possible paths, and the complexity of tracing all the possible paths in the tree becomes O(L·log_v(L)), which is the classification complexity of C4.5.
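To make the path counting concrete, here is a minimal sketch of this traversal behavior (the Node class, attribute names, and class labels are assumptions for illustration, not the authors' implementation): a complete instance follows exactly one root-to-leaf path, while a missing value forces the traversal down every branch of that node, up to all L leaves in the worst case.

```python
class Node:
    """A decision-tree node: internal nodes test an attribute,
    leaves carry a class label."""
    def __init__(self, attribute=None, children=None, label=None):
        self.attribute = attribute      # attribute tested here (None at a leaf)
        self.children = children or {}  # attribute value -> child Node
        self.label = label              # class label (None at internal nodes)

def trace_paths(node, instance):
    """Return the labels of all leaves reached: one leaf for a complete
    instance, several when attribute values are missing."""
    if node.label is not None:          # leaf: one path ends here
        return [node.label]
    value = instance.get(node.attribute)
    if value is None:                   # missing value: follow every branch
        leaves = []
        for child in node.children.values():
            leaves.extend(trace_paths(child, instance))
        return leaves
    return trace_paths(node.children[value], instance)

# Tiny illustrative tree over hypothetical attributes A and B
tree = Node("A", {
    "yes": Node("B", {"low": Node(label="c1"), "high": Node(label="c2")}),
    "no": Node(label="c3"),
})
print(trace_paths(tree, {"A": "yes", "B": "low"}))  # one path: ['c1']
print(trace_paths(tree, {"B": "low"}))              # A missing: ['c1', 'c3']
```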
The complexity of our classification algorithm
In our method, during the classification process, when we encounter a test on an attribute whose value is missing, we trace all the paths corresponding to its possible values. When we reach a leaf in the final tree, we calculate the probability of each missing attribute encountered along that path by calling its tree¹². The way in which we calculate the probability of each missing attribute is not arbitrary: we always start by dealing with the attribute that is least dependent on the class. Therefore, in
¹¹ In the worst case, all the other m−1 attributes are dependent on attribute m.
¹² POAT or PAT, depending on the situation. We take into account the other missing attributes encountered in that tree.
 