Database Reference
In-Depth Information
{}
{}
{}
a:4
b:1
a:4
b:1
a:4
b:1
POINTER
CHASING
OF e
b:3
f:1
f:1
b:3
f:1
b:3
f:1
REMOVAL
OF e
c:3
e:1
c:3
e:1
c:3
d:2
f:1
d:2
d:2
e:2
d:1
e:2
ORIGINAL FP-TREE
CONDITIONAL FP-TREE
WITH SUFFIX ITEM e
Fig. 2.12 Generating a conditional FP-Tree by pointer chasing
paths for the item i . The infrequent nodes from these paths are removed, and they are
put together to create a conditional FP-Tree FPT i . Because the infrequent items have
already been removed from FPT i the new conditional FP-Tree also contains only
frequent items. Therefore, in the next level recursive call, any item from FPT i can be
appended to P i to generate another pattern. The supports of those patterns can also be
reconstructed via pointer chasing during the process of reporting the patterns. Thus,
the current pattern suffix P is extended with the frequent item i appended to the front
of P . This extended suffix is denoted by P i . The pattern P i also needs to be reported
as frequent. The resulting conditional FP-Tree FPT i is the compressed database
representation of
T i of Fig. 2.9 in the previous section. Thus, FPT i is a smaller
conditional tree that contains information relevant only to the extraction of various
prefix paths relevant to different items that will extend the suffix P i further in the
backwards direction. Note that infrequent items are removed from FPT i during this
step, which requires the support counting of all items in FPT i . Because the pointers
have not yet been constructed for FPT i , the support of each item-extension of
P
corresponding to the items in FPT i must be explicitly determined by locating each
instance of an item in FPT i . This is the primary computational bottleneck step. The
removal of infrequent items from FPT i may result in a different structure of the
FP-Tree in the next step.
Finally, if the conditional FP-Tree FPT i is not empty, the FP-growth method is
called recursively with parameters corresponding to the conditional FP-Tree FPT i ,
extended suffix P i , and minimum support s . Note that successive projected trans-
action databases (and corresponding conditional FP-Trees) in the recursion will be
smaller because of the recursive projection. The base case of the recursion occurs
when the entire FP-Tree is a single path. This is likely to occur when the projected
transaction database becomes small enough. In that case, FP-growth determines all
combinations of nodes on this path, appends the suffix P to them, and reports them.
An example of how the conditional FP-Tree is created for a minimum support of
1 unit, is illustrated in Fig. 2.12 . Note that if the minimum support were 2, then the
right branch (nodes b and f ) would not be included in the conditional FP-Tree. In this
case, the pointers for item e are chased in the FP-Tree to create the conditional prefix
paths of the relevant conditional transaction database. This represents all transactions
{
i
}∪
Search WWH ::




Custom Search