Database Reference
In-Depth Information
Null
Level 0
c
e
Level 1
a
b
d
f
ab
ac
ad
af
bc
bd
cd
cf
df
Level 2
abc
abd
acd
acf
ad
cd
cdf
Level 3
acdf
Level 4
Fig. 2.5 The lexicographic tree (also known as enumeration tree)
therefore, be used interchangeably. Thus, the problem of frequent itemset generation
is equivalent to that of constructing the enumeration tree. The tree can be grown
in a wide variety of ways such as breadth-first or depth-first order. Because most
of the discussion in this section will use this structure as a base for algorithmic
development, this concept will be discussed in detail here. The main characteristic
of tree-based algorithms is that the enumeration tree (or lexicographic tree) provides
a certain order of exploration that can be extremely useful in many scenarios.
It is assumed that a lexicographic ordering exists among the items in the database.
This lexicographic ordering is essential for efficient set enumeration without rep-
etition. To indicate that an item i occurs lexicographically earlier than j , we will
use the notation i
L j . The lexicographic tree is an abstract representation of the
large itemsets with respect to this ordering. The lexicographic tree is defined in the
following way:
￿
A node exists in the tree corresponding to each large itemset. The root of the tree
corresponds to the null itemset.
￿
Let I
be a large itemset, where i 1 , i 2 ...i k are listed in lexicographic
order. The parent of the node I is the itemset
={
i 1 , ...i k }
{
i 1 , ...i k 1 }
.
This definition of ancestral relationship naturally defines a tree structure on the nodes
that is rooted at the null node. A frequent 1-extension of an itemset such that the
last item is the contributor to the extension will be called a frequent lexicographic
tree extension , or simply a tree extension. Thus, each edge in the lexicographic tree
corresponds to an item which is the frequent lexicographic tree extension to a node.
The frequent lexicographic extensions of node P are denoted by E ( P ). An example
of the lexicographic tree is illustrated in Fig. 2.5 . In this example, the frequent
lexicographic extensions of node a are b , c , d , and f .
Search WWH ::




Custom Search