Discovery of Positive and Negative Rules from Medical Databases Based on Rough Sets - Advanced Techniques in Knowledge Discovery and Data Mining

9.4 Algorithms for Rule Induction

An exclusive rule, which is the contrapositive of a negative rule, is induced by a modification of the algorithm introduced in PRIMEROSE-REX [9.10], as shown in Fig. 9.8. This algorithm works as follows. (1) First, it selects a descriptor [a_i = v_j] from the list of attribute-value pairs, denoted by L. (2) Then it checks whether this descriptor overlaps with the set of positive examples, denoted by D. (3) If so, the descriptor is included in a list of candidates for positive rules, and the algorithm checks whether its coverage is equal to 1.0. If the coverage is equal to 1.0, the descriptor is added to R_er, the formula for the conditional part of the exclusive rule of D. (4) Then [a_i = v_j] is deleted from the list L. This procedure, from (1) to (4), is repeated until L is empty. (5) Finally, when L is empty, the algorithm generates negative rules by taking the contrapositive of the induced exclusive rules.
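
Steps (1) to (5) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code of Fig. 9.8: the table layout (a list of dicts), the function names, and the toy headache data are all hypothetical, and coverage is taken to be the fraction of positive examples that satisfy a descriptor. Step (5) assumes the exclusive rule reads "D implies the disjunction of the descriptors in R_er", so that its contrapositive fires when none of those descriptors holds.

```python
from typing import Dict, List, Set, Tuple

Descriptor = Tuple[str, str]   # hypothetical encoding of [a_i = v_j]
Row = Dict[str, str]           # one example: attribute name -> value


def supporting_set(rows: List[Row], descriptor: Descriptor) -> Set[int]:
    """Indices of the rows that satisfy the descriptor."""
    attr, value = descriptor
    return {i for i, row in enumerate(rows) if row.get(attr) == value}


def induce_exclusive_rule(
    rows: List[Row], positives: Set[int]
) -> Tuple[List[Descriptor], List[Descriptor]]:
    """Return (r_er, candidates) for the class whose examples are `positives`."""
    # The list L: every attribute-value pair that occurs in the table.
    pairs = sorted({(a, v) for row in rows for a, v in row.items()})
    candidates: List[Descriptor] = []   # candidates for positive rules
    r_er: List[Descriptor] = []         # descriptors of the exclusive rule
    while pairs:                                      # repeat until L is empty
        descriptor = pairs[0]                         # (1) select a descriptor
        overlap = supporting_set(rows, descriptor) & positives  # (2) overlap with D
        if overlap:                                   # (3) keep as a candidate
            candidates.append(descriptor)
            if len(overlap) == len(positives):        # coverage equals 1.0
                r_er.append(descriptor)
        pairs.remove(descriptor)                      # (4) delete it from L
    return r_er, candidates


def negative_rule_fires(row: Row, r_er: List[Descriptor]) -> bool:
    """(5) Contrapositive: if no descriptor of R_er holds, conclude 'not D'."""
    return bool(r_er) and all(row.get(a) != v for a, v in r_er)


# Toy data (invented for illustration); rows 0 and 1 are the positive examples.
rows = [
    {"location": "whole", "nausea": "no"},
    {"location": "whole", "nausea": "yes"},
    {"location": "lateral", "nausea": "yes"},
    {"location": "whole", "nausea": "no"},
]
positives = {0, 1}
r_er, candidates = induce_exclusive_rule(rows, positives)
```

In this toy table only [location = whole] covers every positive example, so it is the single descriptor in `r_er`, and the negative rule fires for the row with location = lateral. The `candidates` list is what the positive-rule algorithm takes as its starting point.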

In contrast, positive rules are induced as inclusive rules by the algorithm introduced in PRIMEROSE-REX [9.10], as shown in Fig. 9.9. For induction of positive rules, the thresholds for accuracy and coverage are set to 1.0 and 0.0, respectively.
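
To make the two thresholds concrete, the following sketch computes both measures for a single descriptor. It assumes the usual rough-set definitions, where R_A is the set of examples satisfying R: accuracy is |R_A ∩ D| / |R_A| and coverage is |R_A ∩ D| / |D|. It also assumes that the thresholds act as inclusive lower bounds. The function names, parameter names, and data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def accuracy_and_coverage(
    rows: List[Dict[str, str]], positives: Set[int], descriptor: Tuple[str, str]
) -> Tuple[float, float]:
    """Accuracy and coverage of one descriptor with respect to the class D."""
    attr, value = descriptor
    covered = {i for i, row in enumerate(rows) if row.get(attr) == value}
    hits = len(covered & positives)
    accuracy = hits / len(covered) if covered else 0.0
    coverage = hits / len(positives) if positives else 0.0
    return accuracy, coverage


def passes_thresholds(
    accuracy: float,
    coverage: float,
    accuracy_threshold: float = 1.0,   # setting used for positive rules
    coverage_threshold: float = 0.0,   # setting used for positive rules
) -> bool:
    """Assumed reading: both thresholds are inclusive lower bounds."""
    return accuracy >= accuracy_threshold and coverage >= coverage_threshold


# Toy data (invented for illustration); rows 0 and 1 are the positive examples.
rows = [
    {"location": "whole", "nausea": "mild"},
    {"location": "whole", "nausea": "yes"},
    {"location": "lateral", "nausea": "yes"},
    {"location": "whole", "nausea": "no"},
]
positives = {0, 1}

acc_mild, cov_mild = accuracy_and_coverage(rows, positives, ("nausea", "mild"))
acc_whole, cov_whole = accuracy_and_coverage(rows, positives, ("location", "whole"))
```

Under this reading, a coverage threshold of 0.0 places no constraint on coverage, so a descriptor qualifies as soon as every example it matches belongs to D. Here [nausea = mild] qualifies even though it covers only half of the positive examples, while [location = whole] does not, because it also matches a negative example.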

The algorithm in Fig. 9.9 works in the following way. (1) First, it initializes L_1, which denotes a list of formulas composed of only one descriptor, with the list L_er generated by the former algorithm shown in Fig. 9.8. (2) Then, until L_1 becomes empty, the following steps are repeated: (a) A formula [a_i = v_j] is removed from L_1. (b) The algorithm then checks whether α_R(D) is larger than the threshold. (For induction of positive rules, this is equivalent to checking whether α_R(D) is equal to 1.0.) If so, the formula is included in a list of the conditional parts of positive rules. Otherwise, it is included in M, which is used for making conjunctions. (3) When L_1 is empty, the next list L_2 is generated from the list M.
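
The loop described above can be sketched as follows. Several details are assumptions made for illustration rather than taken from Fig. 9.9: formulas are encoded as frozensets of (attribute, value) pairs, the next list is built by joining pairs of formulas from M that use distinct attributes, the same steps are repeated on L_2, and a hypothetical `max_length` parameter bounds the number of passes. The accuracy test uses >= so that a threshold of 1.0 means accuracy equal to 1.0.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Descriptor = Tuple[str, str]
Formula = FrozenSet[Descriptor]   # a conjunction of descriptors


def accuracy(rows: List[Dict[str, str]], positives: Set[int], formula: Formula) -> float:
    """Fraction of the examples matching the formula that belong to D."""
    covered = {
        i for i, row in enumerate(rows)
        if all(row.get(a) == v for a, v in formula)
    }
    return len(covered & positives) / len(covered) if covered else 0.0


def induce_positive_rules(
    rows: List[Dict[str, str]],
    positives: Set[int],
    candidates: List[Descriptor],
    accuracy_threshold: float = 1.0,
    max_length: int = 2,            # hypothetical bound on conjunction length
) -> List[Formula]:
    level: List[Formula] = [frozenset([d]) for d in candidates]   # (1) L_1
    rules: List[Formula] = []
    for length in range(1, max_length + 1):
        m: List[Formula] = []                       # the list M
        while level:                                # (2) until the list is empty
            formula = level.pop(0)                  # (a) remove a formula
            if accuracy(rows, positives, formula) >= accuracy_threshold:  # (b)
                rules.append(formula)
            else:
                m.append(formula)
        # (3) build the next list from M: conjunctions with one more descriptor,
        # never combining two values of the same attribute (an assumption).
        next_level: List[Formula] = []
        for f, g in combinations(m, 2):
            conj = f | g
            attrs = {a for a, _ in conj}
            if len(conj) == length + 1 and len(attrs) == len(conj) and conj not in next_level:
                next_level.append(conj)
        level = next_level
    return rules


# Toy data (invented for illustration); rows 0 and 1 are the positive examples.
rows = [
    {"location": "whole", "nausea": "mild"},
    {"location": "whole", "nausea": "yes"},
    {"location": "lateral", "nausea": "yes"},
    {"location": "whole", "nausea": "no"},
]
positives = {0, 1}
candidates = [("location", "whole"), ("nausea", "mild"), ("nausea", "yes")]
rules = induce_positive_rules(rows, positives, candidates)
```

In this toy table [nausea = mild] already reaches accuracy 1.0 in the first pass, whereas [location = whole] and [nausea = yes] fall into M and only their conjunction is accepted in the second pass.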

9.5 Experimental Results

For experimental evaluation, a new system called PRIMEROSE-REX2 (Probabilistic Rule Induction Method for Rules of Expert System ver. 2.0) was developed, in which the algorithms discussed in Section 9.4 were implemented. PRIMEROSE-REX2 was applied to the following three medical domains: (1) headache (RHINOS domain), whose training data consist of 52,119 samples, 45 classes, and 147 attributes; (2) cerebrovascular diseases (CVD), whose training data consist of 7,620 samples, 22 classes, and 285 attributes; and (3) meningitis, whose training data consist of 1,211 samples, 4 classes, and 41 attributes (Table 9.2).

For evaluation, we used the following two types of experiments. One experiment was to evaluate the predictive accuracy using the cross-validation method, which is often used in the machine-learning literature [9.9]. The
