decision tree methods [7, 15] do not resolve the missing-values problem during
the construction of the tree; instead, they classify an object using only its known
attributes. Shapiro's method [21] makes good use of all the information available
from the class and all the other attributes, but a difficulty arises
when the same case has missing values on more than one attribute [15]: during the
construction of a tree to predict an unknown attribute, if a missing value is encountered
for another attribute, another tree must be constructed to predict that attribute,
and so on. This method cannot be used in practice, because recursively
constructing a decision tree whenever missing values are found for an attribute
eliminates too many training cases when there are many unknown attributes.
By constructing the attribute trees according to an order relying only on the mutual
information between the attributes and the class, Lobo and Numao provide a
solution which works in every situation [18]. However, they do not take into
account all the dependencies between attributes, because the trees are built in a
fixed order. It therefore seems sensible to build the tree for an attribute
from the attributes which are dependent on it.
4.2 Probabilistic Approach
Our approach to estimating missing values during classification uses a decision tree
to predict the value of an unknown attribute from its dependent attributes [8].
This value is represented by a probability distribution. We made two proposals.
The first one, called Probabilistic Ordered Attribute Trees (POATs), simply
extends Lobo's OATs [16] with probabilistic data. In this proposal, we construct
a probabilistic attribute tree for each attribute in the training data. These trees
are constructed according to an order guided by the Mutual Information
between the attributes and the class. The attributes used to build the POAT for an
attribute A_i are those whose attribute trees have already been built
and which are dependent on A_i. The result of classifying an object with missing values using
a POAT is a class distribution instead of a single class. These trees give a
probabilistic result which is more refined than that of Lobo's initial OATs. However, they do
not take into account all the dependencies between attributes, because they are
built in the same order as that used by Lobo's OATs. Therefore, we
suggested another approach, called Probabilistic Attribute Trees (PATs), which
uses the dependencies between attributes and also gives a probabilistic result [8].
In the PATs approach, we calculate the Mutual Information between each pair
of attributes in order to determine, for each attribute, its dependent attributes.
A Probabilistic Attribute Tree (PAT) is constructed for each attribute, using all
the attributes which depend on it.
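The pairwise dependence step of the PATs approach can be illustrated with a minimal sketch. The helper names (`mutual_information`, `dependent_attributes`) and the dependence threshold are hypothetical, and discrete attribute values are assumed to be stored as parallel lists; the source does not specify these details.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Estimate I(X; Y) in bits from two parallel lists of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_xy = c / n
        # Sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ) over observed pairs.
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

def dependent_attributes(data, attr, threshold=0.0):
    """Attributes whose mutual information with `attr` exceeds a threshold.

    `data` maps each attribute name to its column of values; the threshold
    separating 'dependent' from 'independent' attributes is an assumption here.
    """
    return [other for other in data
            if other != attr
            and mutual_information(data[attr], data[other]) > threshold]
```

A PAT for attribute `A_i` would then be built using only the columns returned by `dependent_attributes(data, "A_i")`, rather than the class-ordered prefix that POATs use.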
4.3 Classification Algorithm
To classify an instance with missing values using the final probabilistic decision
tree^3, we start tracing the decision tree from its root until we reach a leaf by
^3 A final decision tree is the tree which corresponds to the whole training set.
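Since the result of classification is a class distribution, one plausible reading of the tracing step is the following sketch, in which an estimated missing value (itself a probability distribution produced by an attribute tree) splits the descent across all matching branches, weighting each leaf's class distribution accordingly. The `Node` structure and `classify` function are assumptions for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A decision-tree node: internal nodes test an attribute, leaves hold P(class)."""
    attribute: str = None
    children: dict = field(default_factory=dict)   # attribute value -> child Node
    class_dist: dict = field(default_factory=dict) # class label -> probability

    @property
    def is_leaf(self):
        return not self.children

def classify(node, instance):
    """Trace from `node` to the leaves and return a class distribution.

    `instance` maps attributes to values; a missing value estimated by an
    attribute tree is represented as a dict {value: probability}.
    """
    if node.is_leaf:
        return dict(node.class_dist)
    value = instance[node.attribute]
    if isinstance(value, dict):
        # Estimated missing value: descend every branch, weighted by its probability.
        combined = {}
        for v, w in value.items():
            for c, p in classify(node.children[v], instance).items():
                combined[c] = combined.get(c, 0.0) + w * p
        return combined
    return classify(node.children[value], instance)
```

For example, an instance whose missing `color` was estimated as `{"red": 0.7, "blue": 0.3}` would yield a mixture of the two branches' leaf distributions, weighted 0.7 and 0.3.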