MLEM2 Rule Induction Algorithms: With and Without Merging Intervals - Data Mining: Foundations and Practice


5 Experiments

In our research we used the same data sets that were used for the experiments in [9]. All of these data sets have numerical attributes and are completely specified (i.e., for every attribute and every case the corresponding attribute value is specified). These ten data sets are presented in Table 2. For the experiments we used three different approaches: the original LEM2 algorithm with discretization based on entropy as preprocessing, and two versions of the MLEM2 algorithm. The first version of MLEM2 was not equipped with a mechanism for merging intervals within the same rule. For example, from the pima data set, a typical induced rule was:

6, 38, 38

(Diabetes, 0.078..0.2995) & (Pressure, 57..122) &
(Diabetes, 0.1655..2.42) & (Age, 21..38.5) &
(Pressure, 0..83) & (Glucose, 0..99.5) -> (Class, 0)

It is clear that two conditions, both associated with the same attribute Diabetes, namely:

(Diabetes, 0.078..0.2995) and (Diabetes, 0.1655..2.42)

can be merged into one condition:

(Diabetes, 0.1655..0.2995).

Similarly, for attribute Pressure, the following two conditions

(Pressure, 57..122) and (Pressure, 0..83)

can also be merged into one condition:

(Pressure, 57..83).
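
The merging step illustrated above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MLEM2 implementation itself: the function name `merge_conditions` and the representation of a condition as an `(attribute, low, high)` tuple are hypothetical. Because the conditions of a rule are joined by conjunction, two intervals on the same attribute are merged by taking their intersection, i.e., the larger of the lower bounds and the smaller of the upper bounds.

```python
def merge_conditions(conditions):
    """Merge conditions that share an attribute by intersecting their intervals.

    `conditions` is a list of (attribute, low, high) tuples (a hypothetical
    representation chosen for this sketch). Attributes keep the order in
    which they first appear in the rule.
    """
    merged = {}  # attribute -> (low, high); dicts preserve insertion order
    for attribute, low, high in conditions:
        if attribute in merged:
            current_low, current_high = merged[attribute]
            # Conjunction of two intervals = their intersection.
            low = max(current_low, low)
            high = min(current_high, high)
        if low > high:
            raise ValueError(f"empty interval for attribute {attribute!r}")
        merged[attribute] = (low, high)
    return [(attribute, low, high) for attribute, (low, high) in merged.items()]


# Left-hand side of the pima rule quoted in the text.
rule = [
    ("Diabetes", 0.078, 0.2995),
    ("Pressure", 57, 122),
    ("Diabetes", 0.1655, 2.42),
    ("Age", 21, 38.5),
    ("Pressure", 0, 83),
    ("Glucose", 0, 99.5),
]

merged_rule = merge_conditions(rule)
for attribute, low, high in merged_rule:
    print(f"({attribute}, {low}..{high})")
```

Applied to the six conditions of the pima rule, the sketch leaves four conditions, with Diabetes restricted to 0.1655..0.2995 and Pressure to 57..83, matching the merged conditions given in the text.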

The third way to induce rules was the newest version of the MLEM2 algorithm that is able to merge conditions with intervals. We used results of

Table 2. Data sets

Data set        Number of
                Cases    Attributes    Concepts
Bank               66         5            2
Bricks            216        10            2
Bupa              345         6            2
Buses              76         8            2
German          1,000        24            2
Glass             214         9            6
HSV               122        11            2
Iris              150         4            3
Pima              768         8            2
Segmentation      210        19            7
