Information Technology Reference
In-Depth Information
Session 0
Session 1
Session 2
0
0
0
1
7
10
1
20
7
17
1
2
8
5
11
12
13
16
21
22
18
3
4
14
15
6
9
19
Fig. 10.3
Integer-indexed tree of user sessions in Table
10.2
Table 10.4
An integer-indexed tree in Fig.
10.3
formatted as a string-like representation as used
in [
51
]
tid
cid
|
S
|
Pre-order(depth-first) encoding
0
0
25
0123
−
14
−
1
−
156
−
1
−
1
−
1789
−
1
−
1
−
11011
−
112
−
1
−
1
1
1
25
0 1 13 14
−
115
−
1
−
116
−
1
−
117
−
1
2
2
25
0 1 18 19
−
1
−
1
−
12021
−
1
−
1722
−
1
−
1
- - - -
tid
transaction-id,
cid
omitted (i.e. equal to
tid
),
|
S
|
size of string
In this example, the DSM is reflected in the structure of
T
6
in Fig.
10.2
and it
becomes the header of the table to reflect the attribute names as explained pre-
viously. The string encoding is used to represent this uniform structure and since
the order of the nodes (and backtracks ('-1')) is important, the nodes and back-
tracks are labeled sequentially according to their occurrence in the string encod-
ing. For nodes (labels in the string encoding),
x
i
is used as the attribute name,
where
i
corresponds to the pre-order position of the node in the tree, while for
backtracks,
b
j
is used as the attribute name, where
j
corresponds to the backtrack
number in the string encoding. Hence, from our example in Fig.
10.2
and Table
10.1
,
(DSM)
b
7
'.
To fill in the remaining rows, every transaction from
T
db
is scanned and when a
label is encountered, it is placed in the matching column (i.e. under the matching
node (
x
i
) in the DSM), and when a backtrack ('-1') is encountered, a value '1' (or
'y') is placed in the matching column (i.e. matching backtrack (
b
j
)inDSM).The
remaining entries are assigned values of '0' (or 'No', indicating non existence). The
flat data format of
T
db
from Table
10.1
(and Fig.
10.2
) is illustrated in Table
10.5
.
The conversion process can be formalized as follows. Let the tree database con-
sisting of
n
transactions be denoted as
T
db
={
=
'
x
0
,
x
1
,
x
2
,
x
3
,
b
0
,
x
4
,
b
1
,
b
2
,
b
3
,
x
5
,
x
6
,
x
7
,
b
4
,
x
8
,
b
5
,
b
6
,
tid
0
,
tid
1
,...,
tid
n
−
1
}
, and let the string
encoding of the tree instance at transaction
tid
i
be denoted as
˕(
tid
i
)
.TheDSMis
extracted from
T
db
using the procedure explained earlier. Further, let
|
˕(
tid
i
)
|
denote
the number of elements in
˕(
tid
i
)
, and
˕(
tid
i
)
k
(
k
={
0
,
1
,...,
|
˕(
tid
i
)
|−
1
}
) denote
Search WWH ::
Custom Search