Database Reference
In-Depth Information
are created by navigating through the XML representation. Whenever a result of an
XPath expression is reached, the result is bound to the XQuery variable $v assigned to
this XPath expression in the transformation instruction, and the automata for all rela-
tive expressions that refer to the variable $v as context node are turned active. If we
consider for example the source document and the XQuery transformation given in
Fig. 1, we have the absolute XPath expression XPa=/cont/country, the results of
which are bound to the variable $country and the relative XPath expression
XPr=$country/city. So whenever whenever a result of XPa is found (e.g. in line (2) of
the source document), the automaton for XPr is started, and from that moment on, the
XML events are not only passed to the automaton of XPa, but also to the automaton
of XPr. Only when the end of the result is reached, i.e., the end-tag of the root of the
sub-tree that is the result of the query XPa is read (e.g. in line (5) of the source docu-
ment), XPr is stopped, and it does not receive any further XML events.
In order to avoid any unnecessary parsing and decompression, the automata send
feedback to the source, which navigation events are currently relevant, i.e., they report
the set of labels of outgoing transitions of all currently active states. If we consider
e.g. the absolute XPath expression /cont/country, and if the current node CN of any
given source document is at level 2 (i.e., CN is a grand-child of the document root),
navigating down to any level below via the first-child axis cannot lead to a result.
Therefore, the whole sub-tree having the parent CN can be skipped and does not need
to be decompressed.
This evaluation of absolute and relative XPath expressions describes a generic op-
timization for XQuery evaluation that can be applied to all XML representations sup-
porting the basic operations, but this generic optimization need not be applied. Note
that whenever an XML representation supports a more efficient XPath evaluation for
a given XML document, this evaluation can be used instead of the generic part.
Navigation in Uncompressed XML. In order to navigate via the binary axes within
an uncompressed XML document, we have to consider pairs of tags. Reading a
start-tag following another start-tag corresponds to a first-child event. Reading a start-
tag following an end-tag corresponds to a next-sibling event. Reading an end-tag
following another end-tag corresponds to an end-of-parent event. Reading an end-tag
following a start-tag does not generate an event at all. Finally, if the current tag is a
start-tag with label 'a', this corresponds to a self::a event.
To skip a sub-tree means 'ignoring' its tags, i.e. skipping as many tags, until the
number of the skipped start-tags is equal to the number of skipped end-tags.
Navigation in Succinct Compression. The Succinct Compression [6] stores the tree
structure of the XML document in a bit stream that contains a '1'-bit for each start-tag
and a '0'-bit for each end-tag within the document. The labels of the document tree
are stored in so called inverted element lists, that contain for each label L the list of
positions of '1'-bits within the bit stream that correspond to an element with label L.
For example, Fig. 3 shows the bit stream BS and the inverted element lists IES of
the source document of Fig. 1(a) and the bit stream BT and the inverted element lists
IET of the target document of Fig. 1(b). The data structure P is the index of positions
within BS and BT to which the entries in the inverted element lists refer. The inverted
element list with label “=T” represents positions of text values within the bit stream,
and each text node is represented as a bit sequence '10' in the bit stream.
Search WWH ::




Custom Search