each tree that is called, we trace only one path. At the end of our classification process, we have called several attribute trees. We can calculate the complexity of our classification algorithm as follows: let us assume that at the end of our classification, we have called nbAT attribute trees. Therefore, the complexity of these nbAT trees is:
$$O\left(\sum_{i=1}^{nbAT} ComplexityTree(Tree_i)\right) \qquad (4.7)$$

$$ComplexityTree(Tree_i) = \begin{cases} O(0) & \text{if } Tree_i \text{ has a single leaf } (L_{Tree_i} = 1) \\ O(\log(L_{Tree_i})) & \text{otherwise} \end{cases} \qquad (4.8)$$

where $L_{Tree_i}$ is the number of leaves in tree $Tree_i$.
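The cost model of equations 4.7 and 4.8 can be sketched as a small helper; the function names below are ours, and the leaf counts in the example are illustrative:

```python
import math

def complexity_tree(num_leaves: int) -> float:
    """Cost of tracing one root-to-leaf path in an attribute tree (Eq. 4.8):
    zero for a single-leaf tree, otherwise logarithmic in the number of leaves."""
    if num_leaves <= 1:
        return 0.0
    return math.log2(num_leaves)

def total_attribute_tree_cost(leaf_counts: list[int]) -> float:
    """Sum of the per-tree costs over all nbAT called attribute trees (Eq. 4.7)."""
    return sum(complexity_tree(n) for n in leaf_counts)

# Example: three attribute trees with 8, 4 and 1 leaves.
cost = total_attribute_tree_cost([8, 4, 1])  # 3 + 2 + 0 = 5
```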
To calculate nbAT, the number of attribute trees called during the classification process, let us assume that our test instance has $m$ missing attributes and that attribute $i$ takes $v_i$ possible values. If all of these attributes are dependent, we can calculate the number of attribute trees called during our classification process as follows:
$$nbAT = m \times (v_1 \times v_2 \times \dots \times v_m)$$

Assuming that $v = \frac{v_1 + v_2 + \dots + v_m}{m}$ is the average number of possible values for each attribute, the number of called trees becomes:

$$nbAT = m \times v^m \qquad (4.9)$$
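The exact count and the average-based count of equation 4.9 can be compared numerically; this is a minimal sketch, with variable names of our choosing:

```python
import math

def nbAT_exact(value_counts: list[int]) -> int:
    """Number of attribute trees called when all m missing attributes
    are dependent: m * (v_1 * v_2 * ... * v_m)."""
    m = len(value_counts)
    return m * math.prod(value_counts)

def nbAT_average(value_counts: list[int]) -> float:
    """Same count using the average v = (v_1 + ... + v_m) / m,
    as in Eq. 4.9: m * v^m."""
    m = len(value_counts)
    v = sum(value_counts) / m
    return m * v ** m

counts = [2, 3, 4]             # three missing attributes
exact = nbAT_exact(counts)     # 3 * (2 * 3 * 4) = 72
approx = nbAT_average(counts)  # average v = 3, so 3 * 3^3 = 81
```

The average-based form over-counts slightly here (81 versus 72), since the product of the $v_i$ is bounded above by $v^m$ when the counts differ.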
From equations 4.7, 4.8 and 4.9, we calculate the complexity of all the attribute
trees called during the classification as:
$$O\left(\sum_{i=1}^{nbAT} ComplexityTree(Tree_i)\right) = O\left(\sum_{i=1}^{nbAT} \log(L_{Tree_i})\right) = O\left(\log\left(\prod_{i=1}^{nbAT} L_{Tree_i}\right)\right)$$
$$= O\left(\log\left(\prod_{i=1}^{nbAT} L_T\right)\right) = O\left(\log\left(L_T^{\,nbAT}\right)\right) = O(nbAT \times \log(L_T)) = O(m \times v^m \times \log(L_T)) \qquad (4.10)$$
where:
- $L_T$ is the average number of leaves in an attribute tree;
- $m$ is the number of missing attributes in the instance.
We remark that the complexity in equation 4.10 grows exponentially with the number of missing values in the instance, through the factor $v^m$.
Finally, the complexity of classifying an instance in the final decision tree using our approach is equal to the complexity of tracing this decision tree plus the complexities of all the attribute trees called during the classification process:
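This total can be sketched numerically as follows. We assume here, by analogy with equation 4.8, that tracing the final decision tree costs $O(\log(L))$ in its number of leaves; that assumption, and all the example values, are ours:

```python
import math

def total_classification_cost(L_final: int, m: int, v: float, L_T: float) -> float:
    """Total cost of classifying one instance: one root-to-leaf path in the
    final decision tree, plus the m * v^m attribute trees of Eq. 4.10."""
    decision_tree_cost = math.log2(L_final)            # tracing the final tree
    attribute_trees_cost = m * v ** m * math.log2(L_T)  # Eq. 4.10
    return decision_tree_cost + attribute_trees_cost

# Example: a final tree with 64 leaves, 2 missing attributes,
# v = 3 values on average, attribute trees with 16 leaves on average.
cost = total_classification_cost(64, 2, 3, 16)  # 6 + 2 * 9 * 4 = 78
```

With these numbers, the attribute trees dominate the total (72 of 78 units), which illustrates the remark above: the $v^m$ factor drives the cost as $m$ grows.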