Database Reference
In-Depth Information
{}
{}
{}
a:4
b:1
a:4
b:1
a:4
b:1
POINTER
CHASING
OF e
b:3
f:1
f:1
b:3
f:1
b:3
f:1
REMOVAL
OF e
c:3
e:1
c:3
e:1
c:3
d:2
f:1
d:2
d:2
e:2
d:1
e:2
ORIGINAL FP-TREE
CONDITIONAL FP-TREE
WITH
SUFFIX
ITEM
e
Fig. 2.12
Generating a conditional FP-Tree by pointer chasing
paths for the item
i
. The infrequent nodes from these paths are removed, and they are
put together to create a conditional FP-Tree
FPT
i
. Because the infrequent items have
already been removed from
FPT
i
the new conditional FP-Tree also contains only
frequent items. Therefore, in the next level recursive call, any item from
FPT
i
can be
appended to
P
i
to generate another pattern. The supports of those patterns can also be
reconstructed via pointer chasing during the process of reporting the patterns. Thus,
the current pattern suffix
P
is extended with the frequent item
i
appended to the front
of
P
. This extended suffix is denoted by
P
i
. The pattern
P
i
also needs to be reported
as frequent. The resulting conditional FP-Tree
FPT
i
is the compressed database
representation of
T
i
of Fig.
2.9
in the previous section. Thus,
FPT
i
is a smaller
conditional tree that contains information relevant only to the extraction of various
prefix paths relevant to different items that will extend the suffix
P
i
further in the
backwards direction. Note that infrequent items are removed from
FPT
i
during this
step, which requires the support counting of all items in
FPT
i
. Because the pointers
have not yet been constructed for
FPT
i
, the support of each item-extension of
P
corresponding to the items in
FPT
i
must be explicitly determined by locating each
instance of an item in
FPT
i
. This is the primary computational bottleneck step. The
removal of infrequent items from
FPT
i
may result in a different structure of the
FP-Tree in the next step.
Finally, if the conditional FP-Tree
FPT
i
is not empty, the
FP-growth
method is
called recursively with parameters corresponding to the conditional FP-Tree
FPT
i
,
extended suffix
P
i
, and minimum support
s
. Note that successive projected trans-
action databases (and corresponding conditional FP-Trees) in the recursion will be
smaller because of the recursive projection. The base case of the recursion occurs
when the entire FP-Tree is a single path. This is likely to occur when the projected
transaction database becomes small enough. In that case,
FP-growth
determines all
combinations of nodes on this path, appends the suffix
P
to them, and reports them.
An example of how the conditional FP-Tree is created for a minimum support of
1 unit, is illustrated in Fig.
2.12
. Note that if the minimum support were 2, then the
right branch (nodes
b
and
f
) would not be included in the conditional FP-Tree. In this
case, the pointers for item
e
are chased in the FP-Tree to create the conditional prefix
paths of the relevant conditional transaction database. This represents all transactions
{
i
}∪