Database Reference
In-Depth Information
In order to illustrate the process of copying a sub-tree from the source Succinct re-
presentation into the target Succinct representation, we assume that we want to copy
the sub-tree BS' of the source document that is represented by the positions 18 to 21
of the bit stream BS given in Fig. 3. The affected bits and inverted element list entries
are emphasized by bold letters. Furthermore, we assume that we want to append them
to the target document that up to now consists of the positions 0 to 11 of the bit
stream BT of the target document of Fig. 3, i.e., we want to insert the copied sub-tree
at position k=12.
Having copied the bits from position n (i.e. 18) to position m (i.e. 21) of BS to BT,
we run through the inverted element list and copy all positions x with n≤x≤m (i.e.
position 18 of the list “city” and position 19 of the list “=T”). As we only append new
nodes to the current end of the target document, we do not have to re-compute index
positions within the inverted element lists of nodes that have been previously added to
the target document. In order to insert the copied sub-tree at position k (i.e. 12), we
add k-n to each copied position (i.e., we add 12-18=-6 to the positions 18 and 19 and
yield the new positions 12 and 13). Then, we can insert the list of positions of the
copied sub-tree into the list of positions of the target document for each label (i.e., we
add positions 12 to the list “city” of BT and position 13 to the list “=T” of BT).
Even if we have to copy more than one sub-tree, we can copy the relevant parts of
the bit stream and the inverted element lists within a single pass. We first sort the
positions of the sub-trees to be copied in ascending order, process the bit stream and
copy all relevant bits, and determine the start and end positions of the copied sub-
trees. Then, we can run a single pass through the inverted element lists in which we
adapt the positions and insert the positions that were copied.
3 Evaluation of Our Prototype Implementation
To evaluate the performance of our Java prototype implementation, we conducted
several tests. We ran the tests on an Intel Core i5 450M with 2.4 GHz and 4 GB
memory using Linux and JRE 1.6.0. The tests compare our prototype implementation
to version 1.4.0 of the XQuery evaluator Zorba (http://www.zorba-xquery.com/),
started by zorba -f -q queryFile -o outputFile.
For each test, we compared five methods of transformation. Three methods use our
prototype implementation: (1) transforming uncompressed data, (2) indirectly trans-
forming data compressed via Succinct, i.e., first decompression, then uncompressed
transformation, and, finally, compression of the transformation's result and (3) direct
transformation of data compressed via Succinct. Two methods use Zorba: (4) trans-
forming uncompressed data and (5) indirectly transforming data compressed via Suc-
cinct as explained above. With Zorba, it is not possible to transform compressed data
directly.
In the first test, we use an XML source file of variable size consisting of the root
node <Country > that itself consists of a single <Name> node and of N <City > nodes.
Based on that XML source file, we evaluate an XQuery that selects the root node
<Country > . Afterwards, using the root node as context node, it evaluates the two
relative XPath expressions $country/Name and $country/City . For each <City > node,
the selected nodes are copied to the transformation's target document.
Search WWH ::




Custom Search