columns of tuples 1 to n-1 belonging to the output columns of Q2 (i.e., the values of
all tuples of the inner query). If we consider the sub-sequence SSQ consisting of the
tuples 7-9 of Table 4, we first write the value 'Spain' (i.e., the value of tuple 7, the
first tuple, belonging to the output columns of Q1), followed by the values C#023 and
C#101 (i.e., the values of tuples 7 and 8, which are tuples 1..n-1 for the size n=3 of
the sub-sequence SSQ).
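The ordering of values described above can be illustrated with a minimal Python sketch. It assumes that each tuple is a dictionary and that the output columns of Q1 and Q2 are given as lists of column names. The column names nation and custkey, the function name value_sequence, and the empty contents of the third tuple are hypothetical placeholders, since Table 4 is not reproduced here.

```python
def value_sequence(subseq, q1_cols, q2_cols):
    """Return the values written for one sub-sequence of n tuples.

    First the Q1 output columns of the first tuple are written, then the
    Q2 output columns of tuples 1..n-1 (the last tuple contributes nothing).
    """
    n = len(subseq)
    values = [subseq[0][col] for col in q1_cols]
    for row in subseq[: n - 1]:
        values.extend(row[col] for col in q2_cols)
    return values


# Hypothetical stand-in for tuples 7-9 of Table 4 (n = 3).
ssq = [
    {"nation": "Spain", "custkey": "C#023"},  # tuple 7
    {"nation": None, "custkey": "C#101"},     # tuple 8
    {"nation": None, "custkey": None},        # tuple 9 (contents not used)
]

result = value_sequence(ssq, q1_cols=["nation"], q2_cols=["custkey"])
```

Under these assumptions, result holds 'Spain', C#023 and C#101 in that order, matching the example in the text.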
Retrieving the Compressed Document Structure. Let Q1 and Q2 be two nested
sub-queries, where Q1 contains Q2. At the beginning of each sub-sequence of tuples
within the result set belonging to Q1, we store a counter in the structure stream that is
initialized with 0 and that is incremented whenever a tuple with the Query_ID of Q1
is read. Similarly, at the beginning of each nested sub-sequence of tuples within the
result set belonging to Q2, we add a new counter cQ2 to the structure stream that is
initialized with 0 and that is incremented whenever a tuple with the Query_ID of Q2 is
read. The counter cQ2 is closed (i.e., it can no longer be incremented) whenever a
tuple with the Query_ID of Q1 is read.
If we apply this process to the query result shown in Table 3 of the query given in
Fig. 2, we get exactly the compressed XML document as given in Fig. 3.
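The counter handling for two nested sub-queries can be sketched as follows. This is only an illustration under simplifying assumptions: the result set is reduced to the sequence of Query_IDs of its tuples, a single sub-sequence belonging to Q1 is processed, and the structure stream is modelled as a flat Python list of integers in the order in which the counters are created. The actual encoding used by XSDS is not shown here, and the function name build_structure_stream is hypothetical.

```python
def build_structure_stream(query_ids, outer="Q1", inner="Q2"):
    """Build the counters for one Q1 sub-sequence with nested Q2 tuples."""
    stream = [0]        # counter for Q1, stored at the start of the sub-sequence
    outer_idx = 0
    inner_idx = None    # position of the currently open cQ2 counter, if any

    for qid in query_ids:
        if qid == inner:
            if inner_idx is None:
                # A nested Q2 sub-sequence begins: add a new counter cQ2.
                stream.append(0)
                inner_idx = len(stream) - 1
            stream[inner_idx] += 1
        elif qid == outer:
            stream[outer_idx] += 1
            inner_idx = None    # a Q1 tuple closes the open cQ2 counter
        else:
            raise ValueError(f"unexpected Query_ID: {qid!r}")
    return stream


# Two Q2 tuples, a Q1 tuple, one more Q2 tuple, and a final Q1 tuple.
counters = build_structure_stream(["Q2", "Q2", "Q1", "Q2", "Q1"])
```

For this input, the Q1 counter ends at 2, the first cQ2 counter at 2, and the second cQ2 counter at 1. Deeper nesting would need one open counter per nesting level, which this sketch does not cover.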
Recall that decompressing back to XML and querying the compressed document
can both be done by the XSDS decompressor described in [1].
4 Evaluation
We have evaluated our approach using the database systems Oracle 10g Express and
IBM DB2 Express. As both have shown similar results, we concentrate on the DB2
results within this evaluation section.
We have used the TPC-H benchmark (http://www.tpc.org/tpch/) to create a relational
database. We have tested five different kinds of queries that select customers
sorted by nation (CN4 and CN16), customer data (C400 and C3200), article data (A4
and A16), supplier data (S4 and S16), and order data including customer and supplier
information (O2 and O4). Each of these queries contains a range condition in its
where clause, such that the result size can be scaled.
For the evaluation of the compression ratio reached by XSDS, please refer to [1].
Fig. 4 shows the evaluation times for our set of queries, both for the indirect
approach (i.e., evaluating the SQL/XML query and then compressing the result) and
for the direct approach (i.e., generating compressed XML directly from the
SQL/XML query and the relational data), each relative to the SQL/XML query
evaluation time (100%). Our approach not only computes the compressed data in less
time than the indirect approach needs in total; for all queries tested, it even
computes the compressed data in less time than the SQL/XML query evaluation alone
takes. Furthermore, our approach scales better for larger result sets: for each
pair of queries that share the same initial letters and differ in the scaled result
size (e.g., S4 and the four times larger S16), the performance gain relative
to the query evaluation time increases as the result gets larger.