where p_j is defined as the relative frequency of a concept C_j, equal to |S ∩ C_j| / |S|.
For a given attribute a, the cutpoint q for which E(a, U, q) is minimal is the
best cutpoint. In order to induce k intervals the above procedure is applied
recursively k − 1 times. After determining the first cutpoint q that defines a
partition of U into two sets S_1 and S_2, we compute E(a, S_1, q_1) and E(a, S_2, q_2)
for two candidate cutpoints q_1 and q_2 for S_1 and S_2, respectively. Among q_1
and q_2, we select the cutpoint with the larger entropy. Thus, the worse of the sets
S_1 and S_2 is partitioned.
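The recursive cutpoint selection described above can be sketched as follows. This is a minimal illustration, not the LERS implementation; the function names (entropy, conditional_entropy, best_cutpoint) and the list-based data layout are assumptions:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of concept (class) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())

def conditional_entropy(values, labels, q):
    """E(a, S, q): size-weighted entropy of the two blocks that
    cutpoint q induces on attribute values over the set S."""
    left = [c for v, c in zip(values, labels) if v <= q]
    right = [c for v, c in zip(values, labels) if v > q]
    n = len(labels)
    return (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)

def best_cutpoint(values, labels):
    """Return the candidate cutpoint minimizing E(a, S, q).
    Candidates are midpoints between consecutive distinct values;
    assumes at least two distinct attribute values."""
    vs = sorted(set(values))
    candidates = [(lo + hi) / 2 for lo, hi in zip(vs, vs[1:])]
    return min(candidates, key=lambda q: conditional_entropy(values, labels, q))
```

To induce k intervals, one would apply best_cutpoint, then re-apply it to whichever of the two resulting blocks has the larger entropy, for a total of k − 1 rounds.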
The discretization methods presented here can be classified as either local
or global [2]. Local methods are characterized by operating on only one
attribute, while global methods are characterized by considering all attributes
(rather than one) before making a decision where to induce interval cutpoints.
In global discretization, first we select the best attribute and then, for the
selected attribute, we select the best cutpoint. In our approach, see [2], the
best attribute was selected on the basis of the following measure
M_{A^D} = ( Σ_{S ∈ {A^D}} (|S| / |U|) · E(S) ) / |{A^D}|

where {A^D} is the partition of U induced by the discretized attribute A^D. A
candidate attribute for which M_{A^D} is maximum is selected for re-discretization.
Obviously, we need only to re-compute this measure for the
attribute which was last picked for re-discretization.
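The attribute-selection measure can be sketched directly from the formula above. The reconstruction of the measure (an average of size-weighted block entropies) and the dict-of-blocks data layout are assumptions for illustration:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of concept (class) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())

def measure_M(partition_blocks, total_size):
    """M_{A^D}: average of size-weighted block entropies over the
    partition {A^D}. partition_blocks is a list of blocks, each a
    list of the concept labels of the examples in that block."""
    weighted = sum((len(S) / total_size) * entropy(S) for S in partition_blocks)
    return weighted / len(partition_blocks)

def pick_attribute_for_rediscretization(partitions, total_size):
    """Select the attribute with maximum M (the 'worst' discretized
    attribute, i.e., the one that most needs refinement)."""
    return max(partitions, key=lambda a: measure_M(partitions[a], total_size))
```

For example, an attribute whose blocks are pure (each block holds a single concept) scores 0, while an attribute whose blocks mix concepts evenly scores higher and is therefore chosen for re-discretization.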
3 MLEM2
In general, LERS uses two different approaches to rule induction: one is used in
machine learning, the other in knowledge acquisition. In machine learning, or
more specifically, in learning from cases (examples), the usual task is to learn
the smallest set of minimal rules, describing the concept. To accomplish this
goal, LERS uses two algorithms: LEM1 and LEM2 (LEM1 and LEM2 stand
for Learning from Examples Module, version 1 and 2, respectively) [4, 5].
Let B be a nonempty lower or upper approximation of a concept represented
by a decision-value pair (d, w). Set B depends on a set T of attribute-value
pairs t = (a, v) if and only if

∅ ≠ [T] = ∩_{t ∈ T} [t] ⊆ B,

where [(a, v)] denotes the set of all examples such that for attribute a its
value is v.
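The dependency condition can be checked directly from its definition. This is a minimal sketch; representing examples as dicts mapping attribute names to values, and B as a set of example indices, are assumptions for illustration:

```python
def block(examples, pair):
    """[(a, v)]: the set of example indices whose value of attribute a is v."""
    a, v = pair
    return {i for i, ex in enumerate(examples) if ex.get(a) == v}

def depends(examples, T, B):
    """B depends on T iff  ∅ ≠ [T] = ∩_{t ∈ T} [t] ⊆ B."""
    if not T:
        return False
    covered = set.intersection(*(block(examples, t) for t in T))
    # [T] must be nonempty and contained in the approximation B.
    return bool(covered) and covered <= B
```

LEM2 grows T one attribute-value pair at a time until this condition holds, then minimizes T by dropping redundant pairs.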