Updates on Grammar-Compressed XML Data - Advances in Databases

Database Reference

In-Depth Information

U is an update operation, is used for modifying the grammar rule Gi represented by

Ui. The update operation U is applied to the I th terminal symbol T of Gi, e.g., for the

edge (JB(X)', 2, reLTo(z) ), the grammar rule JB(X)' is updated by replacing the 2 nd

terminal symbol, i.e. b, with z. Depending on the particular update operation U

described by the edge (U i , I, U), we do the following.

If U= reLTo(z)) , we substitute the I th terminal symbol of Gi, i.e. T, with z.

If U=newFC(cst) or U=newNS(cst), i.e., if the update requires inserting a

compressed sub-tree cst as a first-child or as a next-sibling of T respectively, we have

to set the current first-child or next-sibling of T as the new next-sibling of the root of

cst. As we might insert the same sub-tree cst at several places within the original

XML document, the most efficient way to do this is to set the next-sibling of the root

rule of cst (which is a null pointer, i.e.

, before the insertion) to a parameter, and to

replace the first-child fc or the next-sibling ns of T by a call of the root rule of cst with

the parameter value fc or ns respectively.

If we consider for example the node 'k1' as selected node in Grammar 3 and want

to insert a node z(

) as the first-child of /k1, i.e., cst consists of the single rule Z(X)

 z(

,X) into Grammar 3 and replace the

current first-child cfc of k1 in Grammar 3 with a call Z(cfc) of this new rule. Thereby,

cfc becomes the next-sibling of z. This means, that the call ' k1(CDE(m (

,X), we would insert a new rule Z(X)  z(

) ),...) ' within

the start rule S of Grammar 3 is replaced with the call ' k1( Z (CDE(m (

) )),...) '.

If U=del, i.e., the update requires deleting the sub-tree having its root in T, we can

simply replace T(FC,NS) with T's own next-sibling NS in the grammar rule

represented by U i . If we remove a formal parameter from a rule during the deletion,

we have to delete the corresponding actual parameter within each call of the modified

rule as well. In the worst case, i.e., if the actual parameter of a rule is defined in the

rule represented by S', we have to modify all the rules represented by nodes of EUD

that lay on paths from S' to U i within the EUD.

For example, if we delete the nodes of the sub-trees, the root of which is selected

by //d//j, i.e. is the j node with first child m or the j node with fist child o, UD contains

the edges (S',1,CDE(X)') , (S',3,CDE(X)') , (CDE(X)',1,JB(X)') , and (JB(X)',1,del).

By applying the delete operation to the first non-terminal of the JB(X)' rule, i.e. to j,

the right-hand side of this rule is replaced with b(

). As now the JB(X)' rule does

not contain a parameter anymore, the parameter has to be removed from the CDE(X)'

rule calling it too. And finally, we have to delete the parameters used in the rule calls

of rule CDE(X)' within the S' rule too. By doing this, the first-child nodes of nodes

//d//j with labels m and o are deleted as well.

3.5 Sharing Identical Nodes

Although the initial grammar to be updated does not contain anymore sub-structures

that can be shared, during the update process new sub-structures are generated that

might be similar or identical to already existing sub-structures.

For example, if we re-label in our example the node /k2//g to 'c' and the node

/k2//h to 'd', after the UD isolation and the update process, the grammar would

contain a rule

HGE'(X)  c(d( JB( X),e (

) ),

)

Advances in Databases

Search WWH ::

Custom Search

Home