Updates on Grammar-Compressed XML Data - Advances in Databases

Database Reference

In-Depth Information

The Parallel Update Problem Definition: Given a set of update operations, the

parallel update problem is to compute a DAG of update paths and to isolate the DAG

of update paths from the grammar G in parallel in order to keep the compressed

grammar small and in order to keep the update process fast.

3.2 First Step of the Parallel Update Process: Constructing an Update DAG

(UD)

Updating the grammar-compressed data is done by a two-step approach on the given

grammar G.

In a first step, we navigate through G and identify the paths that have to be updated

and combine them to an update DAG. Instead of collecting all the update paths

individually while navigating through G, we combine all the update paths having a

common prefix to construct an update tree.

To combine a collection of update paths into an update tree, first, all copies of the

start node, i.e. S' and S'' in the previous example, are renamed to a single copy S' of

the start node. This reflects the fact that all update paths that are combined into an

update tree start at the same root node S'.

Thereafter, each set of common prefixes of multiple update paths is combined to a

common prefix in the update tree as follows. Whenever the update tree contains two

edges (N i ,I,N j ) and (N i ,I,N k ) where neither N j nor N k contains an update operation, the

node N k is renamed to N j . This reflects the fact that both edges represent the unique I th

occurrence of a non-terminal in the grammar rule represented by N i .

Continuing the previous example, the update tree has a common start node S', but

has no common edge of its two update paths.

In addition to combining updates with common prefixes within the update tree, we

transform the update tree into an update DAG (UD). The UD is constructed bottom up

from the update tree by sharing equal sub-trees. Two leaf nodes within the update tree

are equal, if they have the same label, i.e. they contain the same update operation. Two

inner nodes U 1 and U 2 of the update DAG are equal if they are copies of the same

grammar rule and have similar outgoing edges, where two outgoing edges (U 1 , I 1 , U 3 )

and (U 2 , I 2 , U 4 ) from nodes U 1 and U 2 respectively are similar if I 1 =I 2 and U 3 =U 4 .

Continuing the previous example, the sub-DAG rooted in HGE(X)' is equal to the

sub-DAG rooted in HGE(X)''. As similar edges are stored only once in the update

DAG (UD), UD contains the edges { (S',2,HGE(X)'), (S',4,HGE(X)'), (HGE(X)', 1,

JB(X)'), (JB(X)',2,reLTo(z)) } and is shown in Figure 4.

3.3 Second Step of the Parallel Update Process: Isolating UD from GD

The second step of the parallel update process is to isolate the update DAG (UD) from

the grammar DAG (GD) by isolating all the update paths contained in UD from GD in

parallel. UD isolation is done by combining UD and GD into single DAG, called the

extended update DAG (EUD), which is done by adding edges from certain UD nodes

to certain GD nodes, such that after the extension, the EUD represents the result of

isolating the original UD from GD. The UD isolation and update execution procedure

is summarized in Algorithm 1 and is described in the following.

Search WWH ::

Custom Search

Home