Information Technology Reference
In-Depth Information
Session 0
Session 1
Session 2
0
0
0
1
7
10
1
20
7
17
1
2
8
5
11
12
13
16
21
22
18
3
4
14
15
6
9
19
Fig. 10.3
Integer-indexed tree of user sessions in Table 10.2
Table 10.4 An integer-indexed tree in Fig. 10.3 formatted as a string-like representation as used
in [ 51 ]
tid
cid
| S |
Pre-order(depth-first) encoding
0
0
25
0123 14 1 156 1 1 1789 1 1 11011 112 1 1
1
1
25
0 1 13 14
115
1
116
1
117
1
2
2
25
0 1 18 19
1
1
12021
1
1722
1
1
- - - -
tid transaction-id, cid omitted (i.e. equal to tid ),
|
S
|
size of string
In this example, the DSM is reflected in the structure of T 6 in Fig. 10.2 and it
becomes the header of the table to reflect the attribute names as explained pre-
viously. The string encoding is used to represent this uniform structure and since
the order of the nodes (and backtracks ('-1')) is important, the nodes and back-
tracks are labeled sequentially according to their occurrence in the string encod-
ing. For nodes (labels in the string encoding), x i is used as the attribute name,
where i corresponds to the pre-order position of the node in the tree, while for
backtracks, b j is used as the attribute name, where j corresponds to the backtrack
number in the string encoding. Hence, from our example in Fig. 10.2 and Table 10.1 ,
(DSM)
b 7 '.
To fill in the remaining rows, every transaction from T db is scanned and when a
label is encountered, it is placed in the matching column (i.e. under the matching
node ( x i ) in the DSM), and when a backtrack ('-1') is encountered, a value '1' (or
'y') is placed in the matching column (i.e. matching backtrack ( b j )inDSM).The
remaining entries are assigned values of '0' (or 'No', indicating non existence). The
flat data format of T db from Table 10.1 (and Fig. 10.2 ) is illustrated in Table 10.5 .
The conversion process can be formalized as follows. Let the tree database con-
sisting of n transactions be denoted as T db ={
=
' x 0 ,
x 1 ,
x 2 ,
x 3 ,
b 0 ,
x 4 ,
b 1 ,
b 2 ,
b 3 ,
x 5 ,
x 6 ,
x 7 ,
b 4 ,
x 8 ,
b 5 ,
b 6 ,
tid 0 ,
tid 1 ,...,
tid n 1 }
, and let the string
encoding of the tree instance at transaction tid i be denoted as
˕(
tid i )
.TheDSMis
extracted from T db using the procedure explained earlier. Further, let
| ˕(
tid i ) |
denote
the number of elements in
˕(
tid i )
, and
˕(
tid i ) k ( k
={
0
,
1
,..., | ˕(
tid i ) |−
1
}
) denote
 
Search WWH ::




Custom Search