Database Reference
In-Depth Information
BS: 1 1 1 1 0 0 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 0
P: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
BT: 1 1 1 1 0 0 1 1 0 0 0 1 1 1 0 0 0 0
IES:
continent: 0
country: 1, 17
city: 2, 6, 18
river: 11
name: 12
=T: 3, 7, 13, 19
IET:
mainland: 0
nation: 1, 11
city: 2, 6, 12
=T: 3, 7, 13
Fig. 3. Bit stream BS and inverted element lists IES of the source document given in Fig. 1(a),
and bit stream BT and inverted element lists IET of the target document given in Fig. 1(b)
In order to navigate within the Succinct compression, we have to consider two con-
secutive positions within the bit stream. Reading two '1'-bits corresponds to a first-
child event. Reading a '1'-bit following a '0'-bit corresponds to a next-sibling event.
Reading two '0'-bits corresponds to an end-of-parent event. If a self event is currently
relevant, we have to search the current index position within the inverted element list
in order to determine the label of the current node.
To skip a sub-tree means to proceed from the current position in the bit stream, un-
til we have read as many '1'-bits as '0'-bits.
2.3 Copying Compressed Sub-Trees
Generic Copying of Sub-Trees. The generic approach to copy a sub-tree navigates
through the sub-tree in the source document via the binary axes and inserts node by
node into the target document via the operations insertAsFirstChild or insertAsNext-
Sibling. As this corresponds to a decompression of the sub-tree to be copied, it is
more efficient to provide XML-representation specific implementations for copying a
sub-tree.
As the copying of sub-trees is only started after the XPath evaluation has stopped,
all root nodes of the sub-trees to be copied are known in advance. Therefore, the
whole copy-process can be performed within a (second) single pass through the
source and through the target document, assuming that we can use a cache that allows
us to store data that is read from the source document and will be appended to the
target document at a later point in time.
Copying Sub-Trees in Uncompressed XML. For each sub-tree to be copied, we
store the character index of the opening bracket '<' of the start-tag during the XPath
evaluation. In order to copy the sub-tree, the characters are copied and occurrences of
start- and end-tags are counted until as many start-tags as end-tags have been copied.
As target documents are generated in document order, the copied sub-tree is inserted
'as a whole' to the end of the target document.
Copying Sub-Trees in Succinct Compression. For each sub-tree to be copied, we
store the position n of the '1'-bit that represents the start-tag of the sub-tree's root
within the bit stream. In the bit stream, we copy as many bits, until the number of '1'-
bits that have been copied are equal to the number of '0'-bits that have been copied.
Furthermore, the elements found at these positions in the inverted element lists have
to be copied and their positions in the target bit stream have to be computed as
described below.
Search WWH ::




Custom Search