Table 10.5 Flat representation of $T_{db}$ in Fig. 10.2 and Table 10.1

x0  x1  x2  x3  b0  x4  b1  b2  b3  x5  x6  x7  b4  x8  b5  b6  b7
a   b   d   n   1   e   1   1   1   c   0   0   0   0   0   0   1
b   c   0   0   0   0   0   0   1   b   e   0   0   0   0   1   1
b   d   e   0   0   0   0   1   1   0   0   0   0   0   0   0   0
l   m   0   0   0   0   0   0   1   n   0   0   0   0   0   0   1
k   l   m   0   0   0   0   1   1   n   0   0   0   0   0   0   1
b   a   c   f   1   0   0   1   1   d   0   0   0   0   0   0   1
a   b   c   d   1   e   1   1   1   f   g   h   1   i   1   1   1

Table 10.6 Flat representation of $T_{db}$ in Fig. 10.2 and Table 10.1 when minimum support = 3

x0  x1  x2  x3  b0  b1  b2  x4  b3
a   b   c   n   1   1   1   c   1
b   c   0   0   0   0   1   b   1
b   d   e   0   0   1   1   0   0
l   m   0   0   0   0   1   n   1
k   l   m   0   0   1   1   n   1
b   a   c   f   1   1   1   d   1
a   b   c   d   1   1   1   f   1

the $k$th element (a label or a backtrack '-1') of $\varphi(tid_i)$. The flat data format or table $FT(C, R)$ ($C$ = columns, $R$ = rows) is set up where $C = \{c_0, c_1, \ldots, c_{m-1}\}$ ($m = |C| = |\varphi(DSM)|$), and $R = \{r_0, r_1, \ldots, r_{p-1}\}$ ($p = |R| = n + 1$, i.e. one extra row, $r_0$, for the attribute names). The value in column number $x$ and row number $y$ is denoted as $FT(c_x, r_y)$. Hence, to set the attribute names, $FT(c_i, r_0) = \varphi(DSM)_k$ where $i = k$ and $k \in \{0, 1, \ldots, |\varphi(DSM)| - 1\}$.

In addition, during the conversion process as mentioned in [16], one can incorporate the minimum support threshold $s$ so that the DSM captures only those structural characteristics that have occurred in at least $s$% of the tree database. Hence, in some cases only a fraction of a tree instance can be matched to the DSM due to low occurrences in the tree database, but the partial information still needs to be included in the resulting flat table. As an example, for the tree database $T_{db}$ in Table 10.5 and Fig. 10.2, mining the subtrees with a minimum support threshold of 3 yields the DSM 'x0, x1, x2, x3, b0, b1, b2, x4, b3', and the new table is shown in Table 10.6.

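The conversion described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of [16]. The seven trees are pre-order encodings inferred from the rows of Table 10.5, because Table 10.1 is not reproduced here. Tree nodes are matched to DSM nodes by child position. Support is counted as the number of trees that contain a node at a given position. The names `walk`, `build_dsm`, `header` and `flatten` are hypothetical helpers introduced only for this example.

```python
from collections import Counter

BACKTRACK = "-1"

# Pre-order encodings inferred from the rows of Table 10.5 (an assumption);
# as in the tables, the root has no closing backtrack.
TREES = [t.split() for t in [
    "a b d n -1 e -1 -1 -1 c -1",
    "b c -1 b e -1 -1",
    "b d e -1 -1",
    "l m -1 n -1",
    "k l m -1 -1 n -1",
    "b a c f -1 -1 -1 d -1",
    "a b c d -1 e -1 -1 -1 f g h -1 i -1 -1 -1",
]]

def walk(tokens):
    """Yield (token, path); path is the tuple of child indices from the root.
    For a backtrack token, path identifies the node being closed."""
    path, next_child = [], [0]
    for tok in tokens:
        if tok == BACKTRACK:
            yield tok, tuple(path)
            path.pop()
            next_child.pop()
        else:
            path.append(next_child[-1])
            next_child[-1] += 1
            next_child.append(0)
            yield tok, tuple(path)

def build_dsm(trees, min_support=1):
    """Return DSM columns as ('x' | 'b', path) pairs in pre-order.
    A position is kept if at least min_support trees have a node there."""
    counts = Counter(p for t in trees for tok, p in walk(t) if tok != BACKTRACK)
    kept = sorted(p for p, c in counts.items() if c >= min_support)
    columns, open_paths = [], []
    for p in kept:
        # Close every open node that is not an ancestor of p.
        while len(open_paths) > len(p) - 1:
            columns.append(("b", open_paths.pop()))
        columns.append(("x", p))
        open_paths.append(p)
    while len(open_paths) > 1:  # the root is left unclosed
        columns.append(("b", open_paths.pop()))
    return columns

def header(columns):
    """Attribute names for row r0: x0, x1, ... for nodes, b0, b1, ... for backtracks."""
    names, seen = [], {"x": 0, "b": 0}
    for kind, _ in columns:
        names.append(f"{kind}{seen[kind]}")
        seen[kind] += 1
    return names

def flatten(columns, tree):
    """One flat row: the node label (or '0') for x columns,
    '1' (or '0') for b columns, depending on whether the node exists."""
    labels = {p: tok for tok, p in walk(tree) if tok != BACKTRACK}
    row = []
    for kind, p in columns:
        if kind == "x":
            row.append(labels.get(p, "0"))
        else:
            row.append("1" if p in labels else "0")
    return row

dsm3 = build_dsm(TREES, min_support=3)
print(" ".join(header(dsm3)))
print(" ".join(flatten(dsm3, TREES[-1])))
```

With `min_support=1` the header matches the 17 columns of Table 10.5. With `min_support=3` it matches the reduced DSM quoted in the text, and the nodes of the last tree that fall outside the reduced DSM are dropped from its row.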
10.3.4 Tree to Flat Conversion Example Using Academic Institution WebLogs Data
Referring to the Academic Institution WebLogs data example in Sect. 10.3.2, the pre-order encoding format of the tree database needs to be converted into a flat representation as proposed by [14]. The DSM applications were described earlier in