Information Technology Reference
In-Depth Information
is when the sequence is already encoded in the Bitmap which only requires the update
of NB table (line 04): the line corresponding to this sequence in NB is incremented to
maintain its frequency counting. The second case is when the sequence itself is not
present in the Bitmap , but may be represented in SV . In this case, s will be encoded
and inserted in the Bitmap as a new line, involving the update of Index and NB ac-
cordingly (lines 06 and 07). The last case is when the sequence cannot be represented
in SV , i.e. it is affected by the Merge-sequence-vector( s ) function (Fig. 8.4). In this
case, the algorithm adds the process of shifting the Bitmap table in order to adapt
existing sequences coding to the new SV (line 08). Once all the data have been
encoded in this structure, new candidates (line 10) are generated (see candidates gen-
eration section) and compared to the data stored in the Bitmap (line 11) with a fast
access thanks to the Index (see support counting section).
8.3.2 Generation of the Sequence Vector
The sequence vector is generated during the unique scan of the database according to
the algorithm of Fig. 8.3, which depicts the Merge-sequence-vector function which
builds the SV vector according to the sequences present in the database.
Merge-sequence-vector (sequence s )
01 For each item a of s
-- SV is fetched by a position pointer initially set to 0
02
SV suffix -- after the current position
03 If (∃ b s suffix such that b SV suffix)
If a
insert a before b
04 Else insert a at the end of SV
05 update the current position of SV to the position of a
Fig. 8.3. Generation of the Sequence Vector
It takes as parameter a sequence s of the database. It fetches the sequence items one
by one checking their presence in SV in the right order (line 01). If a given item a is
not presents in SV after the current position (line 02), the function checks if their exist
an item b in SV such that b in located after a in s and also located after the
current position in SV . If those two conditions hold, item a will be inserted before
item b in SV (line 03). If only the first condition holds, a will be positioned at the end
of SV (line 04). The process will continue for the next item starting by the position of
item a in SV (line 05). Notice that SV will stay unchanged when the sequence items
belong to SV suffix. This holds when the current SV is sufficient to encode the new
sequence. In the example of Fig. 8.4, the arrow with value 1 shows an insertion of a
sequence HSH. This operation does not change SV. Since this sequence already exists
in the Bitmap, only the corresponding frequency in the NB table is incremented. The
arrow with value 2 shows an insertion of the sequence HWMWH. This operation
modifies the SV vector; then, shifting operations are applied to the Bitmap in order to
preserve the existing encoded sequences, and a new line is added in the Bitmap for
the new sequence. Finally, the frequency of this new sequence is set to one.
Search WWH ::




Custom Search