Mining Efficiently Significant Classification Association Rules - Data Mining: Foundations and Practice

Databases Reference

In-Depth Information

of subsets of [2 n ] means first to arrange the sets of

To apply a family

F

,F |F| .Inthe i th step, each element of [2 n ] will be

contained in F i with a uniformed probability 1 /k and only the CARs in

into a sequence F 1 ,F 2 ,

···

with

labels in F i will be involved in the procedure SIGNIFICANCE-TEST, while

other CARs can be ignored. Thus, we have an O ( k )-complexity to hit each of

the k potential significant rules independently once with a high probability,

due to the property of the randomised k -selector (see Theorem 1). With a list

of the extracted O ( k ) rules (all significant rules and some insignificant rules),

the weighting score of each rule for A is computed. The “best k ” (top- k score)

rules are the significant rules for class A identified in

R

.

4.3 Rule Mining Algorithm

The function (Fig. 2) identifies a rule with a possible large score for class A .

Lemma 1. If there exists only one potential significant rule R j in the cur-

rent test F i and R j is also a significant rule, the function

LOG-TEST

will

return R j .

Proof. According to Definition 9, we know that c A ( R j ) > |I ( A ) |

h =1

c A ( Item h

I ( A )). Thus, the subset of

∈

which contains R j is always chosen for further

binary test if R j is the only potential significant rule including in the current

test F i .

The procedure (Fig. 3) identifies all k significant rules for class A in

R

.

will catch all k signif-

Theorem 2. The procedure

SIGNIFICANCE-TEST

icant rules.

Function LOG-TEST( F i , R );

input: F i (the i th element in F ) and set R ;

output: Rw (a rule in R );

(1)begin

(2)

Rw := null ;

(3)

Temp := F i ;

(4)

while |Temp| > 1do

(5)

choose an arbitrary subset Temp 0 with half CARs in Temp to test;

if ∀Item h ∈Temp 0 c

A ( Item h ) ≥ ∀Item h ∈Temp−Temp 0 c

A ( Item h )

(6)

(7) then Temp← Temp 0 ;

(8) else Temp← Temp− Temp 0 ;

(9) end while

(10) Rw ← Temp ;

(11) return ( Rw );

(12)end

Fig. 2. The LOG-TEST function

Data Mining: Foundations and Practice

Search WWH ::

Custom Search

Home