Database Reference
In-Depth Information
Table 14.11 Frequent pattern vs. probabilistic frequent patterns
Let user-specified minsup = 1.0 and minProb = 0.90
sup (
{
d
}
)
Set of items
Prob ( W j )
COUNT( W j )
2
( d :0.9) t 2 ( d :0.5) t 3
0.45
8192
( d :0.9) t 2 ( d :0.5) t 3
0.45
8192
1
( d :0.9) t 2 ( d :0.5) t 3
+ 0 . 05
+ 8192
=
=
0 . 50
16384
t 2
t 3
0
( d :0.9)
( d :0.5)
0.05
8192
j Prob ( W j )
j COUNT ( W j )
=
1
=
32768
As expSup (
{
d
}
, D 2 )
=
(2
×
0 . 45)
+
(1
×
0 . 50)
+
(0
×
0 . 05)
=
0 . 9
+
0 . 5
=
1 . 4
minsup ,
{
d
}
is
an expected support-based frequent pattern .
As Prob ( sup (
{
d
}
, D 2 )
minsup )
=
0 . 45
+
0 . 50
=
0 . 95
minProb ,
{
d
}
is also a probabilistic
frequent pattern .
sup (
{ e }
)
Set of items
Prob ( W j )
COUNT( W j )
2
( e :0.7)
t 3
( e :0.3)
t 4
0.21
8192
( e :0.7)
t 3
( e :0.3)
t 4
0.49
8192
1
( e :0.7)
t 3 ( e :0.3) t 4
+ 0 . 09
+ 8192
= 0 . 58
= 16384
0
( e :0.7) t 3 ( e :0.3) t 4
0.21
8192
j Prob ( W j )
j COUNT ( W j )
32768
As expSup ( { e } , D 2 ) = (2 × 0 . 21) + (1 × 0 . 58) + (0 × 0 . 21) = 0 . 7 + 0 . 3 = 1 . 0 minsup , { e } is
an expected support-based frequent pattern .
However, as Prob ( sup ( { e } , D 2 )
=
1
=
=
0 . 21 + 0 . 58
=
{ e } is not a
minsup )
0 . 79 < minProb ,
probabilistic frequent pattern.
function (pmf). A pattern X is highly likely to be frequent (i.e., X is a probabilistic
frequent pattern) if and only if its frequentness probability is no less than minProb ,
i.e., P ( sup ( X , D )
minProb . The frequentness probability of X is the
probability that X occurs in at least minsup transactions of D . (Table 14.11 )
Note that frequentness probability is anti-monotonic: All subsets of a p-FP are
also p-FPs. Equivalently, if X is not a p-FP, then none of its supersets is a p-FP,
and thus all of them can be pruned. Moreover, when minsup increases, frequentness
probabilities of p-FPs decrease.
Bernecker et al. [ 15 ] used a dynamic computation technique in computing the
probability function f X ( k )
minsup )
k ), which returns the probability that
the support of a pattern X equals to k . Summing the values of such a probability
function f X ( k ) over all k
=
P ( sup ( X , D )
=
minsup gives the frequentness probability of X because
| D |
| D |
f X ( k )
=
P ( sup ( X , D )
minsup ) .
k minsup
k minsup
Any pattern X having the sum no less than minProb becomes a p-FP.
Sun et al. [ 49 ] proposed the top-down inheritance of support probability func-
tion (TODIS) algorithm, which runs in conjunction with a divide-and-conquer (DC)
approach, to mine probabilistic frequent patterns from uncertain data by extracting
 
Search WWH ::




Custom Search