which is identical to the rule CDE(X). Therefore, we can delete the rule HGE'(X) and replace each call of it by a call of the rule CDE(X). The grammar using rule HGE'(X) is correct and can be decompressed and processed correctly, but after replacing HGE'(X) with CDE(X), the grammar is more compact, i.e., the compression ratio is optimized. For this purpose, we perform a sharing phase after the updating phase.

To find redundant rules, we could compare each modified rule with all existing and modified rules. However, this comparison becomes more efficient when we use the information given by the EUD. Two rules can only be identical if they call the same sequence of other grammar rules. For the EUD, this means that we only have to compare those rules that have the same sequence of children within the EUD. This reduces the number of comparisons within the sharing phase.
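The sharing phase described above can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the CluX implementation: rule bodies are flat tuples of symbols rather than parameterized tree-grammar rules, the sequence of rules a body calls stands in for the sequence of children within the EUD, and the function name `share_rules` and all rule bodies are hypothetical.

```python
from collections import defaultdict


def share_rules(grammar, modified):
    """Delete modified rules that duplicate another rule and redirect their calls.

    grammar:  dict mapping a rule name to its body, a tuple of symbols.
              A symbol that is a key of `grammar` counts as a call of that rule.
    modified: names of the rules changed during the updating phase.
    Returns a dict mapping each deleted rule to the rule now called instead.
    """
    def called(body):
        # Sequence of rules called by a body (assumed stand-in for the
        # sequence of children within the EUD).
        return tuple(symbol for symbol in body if symbol in grammar)

    # Two rules can only be identical if they call the same sequence of rules,
    # so only rules that share a bucket are ever compared.
    buckets = defaultdict(list)
    for name, body in grammar.items():
        buckets[called(body)].append(name)

    replaced = {}
    for name in modified:
        if name in replaced:
            continue
        for other in buckets[called(grammar[name])]:
            if other != name and other not in replaced and grammar[other] == grammar[name]:
                replaced[name] = other
                break

    def target(name):
        # Follow chains such as A -> B -> C down to a rule that survives.
        while name in replaced:
            name = replaced[name]
        return name

    # Delete the redundant rules, then redirect every call of a deleted rule.
    for name in replaced:
        del grammar[name]
    for name, body in grammar.items():
        grammar[name] = tuple(target(symbol) for symbol in body)
    return {name: target(name) for name in replaced}


# Hypothetical rule bodies; only the names CDE and HGE' come from the text.
grammar = {
    "S": ("a", "CDE", "b", "HGE'"),
    "CDE": ("c", "D", "e"),
    "HGE'": ("c", "D", "e"),
    "D": ("d",),
}
replaced = share_rules(grammar, ["HGE'"])
```

In this example, HGE' is compared only with CDE, because S and D call different sequences of rules and therefore fall into other buckets. The sketch makes a single pass; rules that become identical only after calls have been redirected would need a further pass.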
4 Evaluation
All tests were performed on an Intel Core2 Duo CPU P870 @ 2.53 GHz with 4 GB of RAM running our prototype on Java 1.6.

In a first series of measurements, we compared the compression strength of CluX with two other approaches, gzip and bzip2, based on the following XML datasets: 1998statistics (1998 - 656 kB) - baseball statistics of the year 1998; catalog-01 (C1 - 10.4 MB) and dictionary-01 (D1 - 10.4 MB) - documents generated by the XBench benchmark; hamlet (H - 273 kB) - the Shakespeare play; JST_snp.chr (JST - 35.5 MB) - data on the tumor suppressor gene JST; NCBI_gene.chr (NCBI - 23.0 MB) - data from the National Center for Biotechnology Information; Treebank (TB - 51.9 MB) - a parsed text corpus; and XMark (XM - 111.1 MB) - a document that models auctions.

Usually, CluX compresses best (cf. Fig. 6), followed by bzip2 and finally by gzip.

In a second series of measurements, we compared the time for direct updates on the compressed data to the sum of the times needed for decompression, loading the uncompressed document as a DOM tree into main memory (i.e., no updates were performed) and recompression when using CluX, bzip2 or gzip as compression tool.
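The baseline of the second series (decompression, loading as a DOM tree, recompression, with no update applied) can be approximated for gzip and bzip2 with a short script. This is only a sketch under assumptions: it uses Python's standard library rather than the Java prototype, CluX itself is not included, the function name `roundtrip_seconds` is hypothetical, and the tiny synthetic document stands in for the real datasets.

```python
import bz2
import gzip
import time
from xml.dom import minidom


def roundtrip_seconds(compressed, decompress, compress):
    """Time decompression + DOM loading + recompression; no update is applied."""
    start = time.perf_counter()
    xml_bytes = decompress(compressed)
    dom = minidom.parseString(xml_bytes)          # load as a DOM tree in main memory
    recompressed = compress(dom.toxml(encoding="utf-8"))
    return time.perf_counter() - start, recompressed


# Synthetic stand-in document (assumption; not one of the datasets in the text).
doc = (
    "<site>"
    + "".join(f"<item id='{i}'><name>n{i}</name></item>" for i in range(1000))
    + "</site>"
).encode("utf-8")

results = {}
for tool, module in (("gzip", gzip), ("bzip2", bz2)):
    results[tool] = roundtrip_seconds(module.compress(doc), module.decompress, module.compress)
    print(f"{tool}: {results[tool][0]:.4f} s")
```

A single timed run like this is noisy; repeating the measurement and reporting a median would be a more careful design.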
Fig. 6. Compression ratios of CluX compared with bzip2 and gzip
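For the two general-purpose compressors, a ratio of the kind compared in Fig. 6 can be computed as sketched below. The text does not define the compression ratio, so the definition used here (compressed size divided by original size, smaller is better) is an assumption, as are the function name `compression_ratio` and the synthetic sample document; CluX is not included.

```python
import bz2
import gzip


def compression_ratio(original, compressed):
    # Assumed definition: compressed size / original size (smaller is better).
    return len(compressed) / len(original)


# Synthetic, highly repetitive XML sample (assumption; not one of the datasets).
sample = (
    "<players>"
    + "".join(
        f"<player><name>n{i}</name><hits>{i % 50}</hits></player>"
        for i in range(2000)
    )
    + "</players>"
).encode("utf-8")

ratios = {
    "gzip": compression_ratio(sample, gzip.compress(sample)),
    "bzip2": compression_ratio(sample, bz2.compress(sample)),
}
for tool, ratio in ratios.items():
    print(f"{tool}: {ratio:.3f}")
```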