Databases Reference
In-Depth Information
Lemma 6 tells us that if a 1-complete region
C
1
in subspace
S
⊕
a
i
does
not have enough weighted density, we can directly jump to test
S
a
j
for
a
j
<
L
a
i
because anything in between must not meet the minimum weighted
density threshold, which leads to Theorem 1.
⊕
Theorem 1.
The lectical smallest closed subspace larger than a given subspace
S
⊂A
and having weighted density larger than δ is S
⊕
a
i
, where a
i
is the
lexicographically largest attribute which satisfies dens
w
(
S
⊕
a
i
)
>δ and S
i
S
⊕
a
i
.
Rationale
.Let
S
⊕
a
j
be the lectically smallest closed subspace that is larger
than
S
.If
dens
w
(
S
⊕
a
j
)
>δ
, the theorem is true since it is the same case as in
Lemma 3. If
dens
w
(
S
⊕
a
j
)
<δ
,let
a
i
be the largest attribute for which
a
i
<
L
a
j
and
S
i
S
⊕
a
i
hold. So we need to show that
S
⊕
a
j
⊕
a
i
is the lectically
smallest closed subspace that is larger than
S
⊕
a
j
, and potentially could
have enough weighted density. Since
dens
w
(
S
a
j
)
<δ
, Lemma 6 guarantees
the search to start with
a
j−
1
for the smallest weighted dense cluster. Since
S
⊕
j
S
⊕
a
j
,
S
∩{
a
1
,...,a
j−
1
}
=
S
⊕
a
j
∩{
a
1
,...,a
j−
1
}
. So the search for
the next
a
i
performs the same on
S
and
S
⊕
a
j
, that is,
S
⊕
a
i
=
S
⊕
a
j
⊕
a
i
.
So
S
a
i
is the lectically smallest closed subspace that is larger than
S
and
could have enough weighted density. If
dens
(
S
⊕
⊕
a
i
)
>δ
, this theorem is true.
Otherwise, find the next
a
k
<
L
a
i
for which
S
k
S
⊕
a
k
, and the proof can
be completed inductively.
3.3 Lectical Weighted Dense Region Mining Algorithm
Theorem 1 states that if we find that a subspace
S
a
i
is not weighted dense,
we can prune the search space by skipping all
a
j
>
L
a
i
, and check directly on
a
i−
1
in the next iteration of the algorithm. Algorithm 1 is a straightforward
implementation of this idea. Based on the correctness of Theorem 1, we can
conclude the correctness of Theorem 2.
⊕
Algorithm 1
Lectical weighted dense region mining algorithm
1.
C
=
<O, S> ← <ψ
(
φ
)
,ϕ◦ ψ
(
φ
)
>
2.
IF
(
dens
w
(
C
)
>δ
)
3. Add
C
=
<O, S>
to Tree
4.
found← true
5.
END IF
6.
REPEAT
7. (
C
,
found
)
←
findnext(
C
)
8.
UNTIL
found
=
false
Theorem 2.
Algorithm 1 finds all maximal
1
-complete regions that satisfy
the minimum weighted density threshold δ.