Database Reference
In-Depth Information
Item
Expected support
(
a
:0.2):1
(
a
:0.6):2
(
a
:0.9):1
a
2.3
b
2.2
c
1.8
(
b
:0.9):1
(
b
:0.6):2
(
b
:0.2):1
d
1.4
e
1.0
(
c
:0.4):1
(
c
:0.6):1
(
d
:0.5):1
(
c
:0.8):1
(
d
( :0.9):1
(
e
( :0.7):1
(
e
( :0.3):1
Fig. 14.3
The UFP-tree for the probabilistic dataset
D
2
of uncertain data
Table 14.4
Transaction caps for the probabilistic dataset
D
2
of uncertain data
Transaction ID
Set of items with existential probability
Transaction cap
t
1
{
a
:0.2,
b
:0.9,
c
:0.4}
0.36
t
2
{
a
:0.6,
b
:0.6,
c
:0.6,
d
:0.9}
0.54
t
3
{
a
:0.6,
b
:0.5,
d
:0.5,
e
:0.7}
0.42
t
4
{
a
:0.9,
b
:0.2,
c
:0.8,
e
:0.3}
0.72
5.3
CUF-growth
To further reduce the tree size (by reducing the number of tree nodes), Leung and
Tanbeer [
39
] proposed an uncertain frequent pattern mining algorithm called
CUF-
growth
, which builds a new tree structure called
CUF-tree
. Specifically, for each
transaction
t
i
, CUF-growth computes a
transaction cap
which is defined as follows.
Definition 14.2
The
transaction cap
, denoted by
cap
(
t
i
), of a transaction
t
i
is
defined as the product of the two highest existential probability values of items
within transaction
t
i
. Let
(i)
h
=|
t
i
|
represent the length of
t
i
,
(ii)
M
1
=
max
q
∈
[1,
h
]
P
(
x
q
,
t
i
), and
(iii)
M
2
=
max
r
∈
[1,
h
],
r
=
q
P
(
x
r
,
t
i
).
Then,
M
1
×
M
2
if
|
t
i
|
>
1
=
cap
(
t
i
)
(14.4)
P
(
x
1
,
t
i
)
if
|
t
i
|=
1 (i.e.,
t
i
={
x
1
}
)
Table
14.4
shows the transaction cap for each transaction in a probabilistic dataset
D
2
of uncertain data. The CUF-growth algorithm captures the transaction cap in the
CUF-tree. Unlike the UF-tree (which captures an item, its existential probability and
its occurrence count in each tree node), CUF-tree only capture (i) an item and (ii) its
transaction cap. Paths in a CUF-tree are shared if the nodes on these paths share
the same item. By doing so, the CUF-tree (for capturing uncertain data) can be as