Information Technology Reference
In-Depth Information
(
+1) super-pattern can never be frequent. The essential idea is to iteratively
generate the set of candidate patterns of length (
k
k
+ 1) from the set of frequent
patterns of length
, and check their corresponding occurrence frequencies in the
database. The bottleneck of the Apriori method is at the candidate set generation
and test. If one can avoid generating a huge set of candidates, the mining
performance can be substantially improved.
A frequent pattern tree (FP-tree in short) is a tree structure which consists of
one root labeled as “null”, a set of item prefix subtrees as the children of the root,
and a frequent-item header table.
Each node in the item prefix subtree consists of three fields: item-name, count,
and node-link, where item-name registers which item this node represents, count
registers the number of transactions represented by the portion of the path
reaching this node, and node-link links to the next node in the FP-tree carrying
the same item-name, or null if there is none. Each entry in the frequent-item
header table consists of two fields, item-name and head of node-link, which
points to the first node in the FP-tree carrying the item-name. FP-tree
construction algorithm shows as follows.
k
Algorithm 12.2 FP-tree construction (Han et al, 2000)
Input: DB, a database of transactions;
minsup, the minimum support count threshold.
Output: FP-tree, frequent pattern tree.
Method:
1. Scan the transaction database DB once. Collectthe set of frequent items
F
and
their support
s
. Sort
F
in support descending order as
L
, the list of frequent items.
2. Create the root of an FP-tree,
T
, and label it as “null”. For each transaction
T
in DB do thefollowing. Select and sort the frequent items in
T
according to the
order of
L
. Let the sorted frequent item list in
T
be [
p|P
], where
p
is the first
element and
P
is the remaining list. Call insert_tree([
p|P
],
T
).
Procedure : insert_tree([
p
|
P
],
T
)
1. If
T
has a child
N
and
N
.item-name=
p
.item-name, then increment
N
's count
by 1;
2. else create a new node
N
, and let its count be 1, its parent link be linked to
T
,
Search WWH ::




Custom Search