Table 10.5 Flat representation of T_db in Fig. 10.2 and Table 10.1

x0  x1  x2  x3  b0  x4  b1  b2  b3  x5  x6  x7  b4  x8  b5  b6  b7
a   b   d   n   1   e   1   1   1   c   0   0   0   0   0   0   1
b   c   0   0   0   0   0   0   1   b   e   0   0   0   0   1   1
b   d   e   0   0   0   0   1   1   0   0   0   0   0   0   0   0
l   m   0   0   0   0   0   0   1   n   0   0   0   0   0   0   1
k   l   m   0   0   0   0   1   1   n   0   0   0   0   0   0   1
b   a   c   f   1   0   0   1   1   d   0   0   0   0   0   0   1
a   b   c   d   1   e   1   1   1   f   g   h   1   i   1   1   1
Table 10.6 Flat representation of T_db in Fig. 10.2 and Table 10.1 when minimum support = 3

x0  x1  x2  x3  b0  b1  b2  x4  b3
a   b   c   n   1   1   1   c   1
b   c   0   0   0   0   1   b   1
b   d   e   0   0   1   1   0   0
l   m   0   0   0   0   1   n   1
k   l   m   0   0   1   1   n   1
b   a   c   f   1   1   1   d   1
a   b   c   d   1   1   1   f   1
the $k$th element (a label or a backtrack '-1') of $\varphi(tid_i)$. The flat data format, or table, $FT(C, R)$ ($C$ = columns, $R$ = rows) is set up where $C = \{c_0, c_1, \ldots, c_{m-1}\}$ ($m = |C| = |\varphi(DSM)|$) and $R = \{r_0, r_1, \ldots, r_{p-1}\}$ ($p = |R| = n + 1$, i.e. one extra row for the attribute names). The value in column number $x$ and row number $y$ is denoted as $FT(c_x, r_y)$. Hence, the attribute names are set as $FT(c_i, r_0) = \varphi(DSM)_k$, where $i = k \in \{0, 1, \ldots, |\varphi(DSM)| - 1\}$.
In addition, during the conversion process described in [16], one can incorporate the minimum support threshold $s$ so that the DSM captures only those structural characteristics that occur in at least $s$% of the tree database. Hence, in some cases only a fraction of a tree instance can be matched to the DSM, because the rest of its structure occurs too rarely in the tree database, but the partial information still needs to be included in the resulting flat table. As an example, consider the tree database $T_{db}$ in Table 10.5 and Fig. 10.2: when mining the subtrees with a minimum support threshold of 3, the resulting DSM is '$x_0, x_1, x_2, x_3, b_0, b_1, b_2, x_4, b_3$' and the new table is shown in Table 10.6.
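The conversion can be illustrated with a minimal Python sketch. It is not the procedure of [16]; it assumes that tree instances are aligned to the DSM purely by child position (the first child of a node always maps to the first child of the matching DSM node, and so on) and that the minimum support is an absolute count of tree instances. The function names (positions, build_dsm, flat_table) and the T_DB list are hypothetical, and the pre-order encodings in T_DB are read off the rows of Table 10.5 rather than taken from Table 10.1.

```python
from collections import Counter

# Pre-order string encodings (labels and '-1' backtracks), read off Table 10.5.
T_DB = [
    "a b d n -1 e -1 -1 -1 c -1",
    "b c -1 b e -1 -1",
    "b d e -1 -1",
    "l m -1 n -1",
    "k l m -1 -1 n -1",
    "b a c f -1 -1 -1 d -1",
    "a b c d -1 e -1 -1 -1 f g h -1 i -1 -1 -1",
]

def positions(encoding):
    """Return {positional path: label}; a path is a tuple of child indices."""
    nodes, child_count, path = {}, Counter(), None
    for token in encoding.split():
        if token == "-1":
            path = path[:-1]                  # move back up to the parent
        else:
            if path is None:
                new = ()                      # the first label is the root
            else:
                new = path + (child_count[path],)
                child_count[path] += 1
            nodes[new] = token
            path = new
    return nodes

def build_dsm(trees, min_support=1):
    """Return DSM columns as (name, kind, path), in pre-order with backtracks."""
    # Assumption: support of a position = number of trees that contain it.
    support = Counter(p for t in trees for p in positions(t))
    kept = sorted(p for p, c in support.items() if c >= min_support)
    columns, current, n_x, n_b = [], None, 0, 0

    def backtrack_to(depth):
        nonlocal current, n_b
        while len(current) > depth:
            columns.append((f"b{n_b}", "back", current))  # backtrack leaving `current`
            n_b += 1
            current = current[:-1]

    for path in kept:                         # sorted tuples are in pre-order
        if current is not None:
            backtrack_to(len(path) - 1)       # climb to the parent of `path`
        columns.append((f"x{n_x}", "node", path))
        n_x += 1
        current = path
    if current is not None:
        backtrack_to(0)                       # close the traversal at the root
    return columns

def flat_table(trees, min_support=1):
    """Return (attribute names for row r_0, one data row per tree instance)."""
    columns = build_dsm(trees, min_support)
    header = [name for name, _, _ in columns]
    rows = []
    for t in trees:
        nodes = positions(t)
        rows.append([
            nodes.get(path, "0") if kind == "node" else ("1" if path in nodes else "0")
            for _, kind, path in columns
        ])
    return header, rows
```

Under these assumptions, flat_table(T_DB) yields the 17 attribute names of Table 10.5, and flat_table(T_DB, min_support=3) yields the nine attribute names 'x0, x1, x2, x3, b0, b1, b2, x4, b3' of Table 10.6. Node positions missing from an instance are filled with 0, and a backtrack column is set to 1 only when the node it leaves is present.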
10.3.4 Tree to Flat Conversion Example Using Academic Institution WebLogs Data
Referring to the Academic Institution WebLogs data example in Sect. 10.3.2,
the pre-order encoding format of the tree database needs to be converted into a flat
representation as proposed by [ 14 ]. The DSM applications were described earlier in
 
 