An Algorithm for Mining Weighted Dense Maximal 1-Complete Regions - Data Mining: Foundations and Practice

Databases Reference

In-Depth Information

Lemma 6 tells us that if a 1-complete region C 1 in subspace S

⊕

a i does

not have enough weighted density, we can directly jump to test S

a j for

a j < L a i because anything in between must not meet the minimum weighted

density threshold, which leads to Theorem 1.

⊕

Theorem 1. The lectical smallest closed subspace larger than a given subspace

⊂A

and having weighted density larger than δ is S

⊕

a i , where a i is the

lexicographically largest attribute which satisfies dens w ( S

⊕

a i ) >δ and S

⊕

a i .

Rationale .Let S

⊕

a j be the lectically smallest closed subspace that is larger

than S .If dens w ( S

⊕

a j ) >δ , the theorem is true since it is the same case as in

Lemma 3. If dens w ( S

⊕

a j ) <δ ,let a i be the largest attribute for which a i < L a j

and S

i S

⊕

a i hold. So we need to show that S

⊕

a j ⊕

a i is the lectically

smallest closed subspace that is larger than S

⊕

a j , and potentially could

have enough weighted density. Since dens w ( S

a j ) <δ , Lemma 6 guarantees

the search to start with a j− 1 for the smallest weighted dense cluster. Since

⊕

j S

⊕

a j , S

∩{

a 1 ,...,a j− 1 }

= S

⊕

a j ∩{

a 1 ,...,a j− 1 }

. So the search for

the next a i performs the same on S and S

⊕

a j , that is, S

⊕

a i = S

⊕

a j ⊕

a i .

So S

a i is the lectically smallest closed subspace that is larger than S and

could have enough weighted density. If dens ( S

⊕

a i ) >δ , this theorem is true.

Otherwise, find the next a k < L a i for which S

k S

⊕

a k , and the proof can

be completed inductively.

3.3 Lectical Weighted Dense Region Mining Algorithm

Theorem 1 states that if we find that a subspace S

a i is not weighted dense,

we can prune the search space by skipping all a j > L a i , and check directly on

a i− 1 in the next iteration of the algorithm. Algorithm 1 is a straightforward

implementation of this idea. Based on the correctness of Theorem 1, we can

conclude the correctness of Theorem 2.

⊕

Algorithm 1 Lectical weighted dense region mining algorithm

1. C = <O, S> ← <ψ ( φ ) ,ϕ◦ ψ ( φ ) >

2. IF ( dens w ( C ) >δ )

3. Add C = <O, S> to Tree

4. found← true

5. END IF

6. REPEAT

7. ( C , found ) ← findnext( C )

8. UNTIL found = false

Theorem 2. Algorithm 1 finds all maximal 1 -complete regions that satisfy

the minimum weighted density threshold δ.

Data Mining: Foundations and Practice

Search WWH ::

Custom Search

Home