There are two types of compression: lossless and
lossy. In lossless data compression, the
decompressed data is an exact replica of the
original data. In lossy data compression, the
decompressed data may differ from the original;
typically there is some distortion between the
original and the reproduced data. Data compression
must be lossless for typical MOLAP applications.
In this chapter we review MOLAP compression
schemes, discuss important issues related to
MOLAP compression and existing techniques, and
discuss future trends. The chapter is structured
as follows: Section 2 describes the compression
mechanisms of many existing MOLAP compression
schemes. Section 3 reviews other related work in
MOLAP compression. Section 4 discusses some
relevant quality issues in MOLAP compression and
existing compression schemes. Section 5 discusses
future trends. Section 6 discusses some limitations
of compression schemes, and Section 7 concludes the
chapter.
Compression techniques usually provide two
mappings. One is forward mapping, which computes
the location in the compressed dataset given a
position in the original dataset. The other is
backward mapping, which computes the position in
the original dataset given a location in the
compressed dataset. A compression method is called
mapping-complete if it provides both forward
mapping and backward mapping. The terms logical
database and physical database refer to the
uncompressed and compressed database,
respectively.
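
The two mappings can be illustrated with a minimal
sketch. It is not one of the named schemes from this
chapter: it assumes a toy store that keeps only the
non-empty cells of a one-dimensional logical array,
together with their sorted logical positions. The
class name SparseCompressedArray and its methods are
hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


class SparseCompressedArray:
    """Toy mapping-complete store (illustrative, not a scheme from the text).

    Only non-empty cells of the logical array are kept in the physical array.
    """

    def __init__(self, logical):
        # positions[i] is the logical position of the i-th stored value.
        self.positions = [p for p, v in enumerate(logical) if v is not None]
        self.values = [logical[p] for p in self.positions]

    def forward(self, logical_pos):
        """Forward mapping: logical position -> physical location.

        Returns None when the logical cell is empty (not stored).
        """
        i = bisect_left(self.positions, logical_pos)
        if i < len(self.positions) and self.positions[i] == logical_pos:
            return i
        return None

    def backward(self, physical_loc):
        """Backward mapping: physical location -> logical position."""
        return self.positions[physical_loc]


# Logical database: 7 cells, 3 of them non-empty.
logical = [None, 7, None, None, 3, 9, None]
store = SparseCompressedArray(logical)

loc = store.forward(4)      # physical location of logical cell 4
pos = store.backward(2)     # logical position of physical cell 2
```

Because both directions are available, this toy store
is mapping-complete in the sense defined above; a
scheme that could only decompress sequentially would
not be.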
Some mapping-complete compression schemes, such
as header compression, BAP compression, run-length
encoding, and bitmap compression, first transform
the multidimensional data into a linearized array
using the array linearization function. The
linearized data are then compressed by a
mapping-complete compression method. Li and
Srivastava (2002) applied this idea to implement
compressed MOLAP using the header compression
method. Hence these mapping-complete compression
schemes can be used to compress higher-dimensional
data sets once the data have been linearized with
the array linearization function.
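
As a rough illustration of an array linearization
function, the sketch below maps a tuple of subscripts
to a single offset and back. It assumes row-major
ordering (last dimension varies fastest); the text
does not state which ordering is used, and the
function names linearize and delinearize are
hypothetical.

```python
def linearize(subscripts, shape):
    """Map n-dimensional subscripts to a 1-D offset (row-major assumed)."""
    if len(subscripts) != len(shape):
        raise ValueError("subscripts and shape must have the same length")
    offset = 0
    for s, size in zip(subscripts, shape):
        if not 0 <= s < size:
            raise IndexError(f"subscript {s} out of range for size {size}")
        # Horner-style accumulation: shift by this dimension's size, add subscript.
        offset = offset * size + s
    return offset


def delinearize(offset, shape):
    """Inverse of linearize: recover the subscripts from a 1-D offset."""
    subscripts = []
    for size in reversed(shape):
        offset, s = divmod(offset, size)
        subscripts.append(s)
    return tuple(reversed(subscripts))


shape = (3, 4, 5)                     # a 3 x 4 x 5 logical array
offset = linearize((1, 2, 3), shape)  # (1 * 4 + 2) * 5 + 3 = 33
cell = delinearize(offset, shape)     # back to (1, 2, 3)
```

Once every cell has such an offset, any
one-dimensional mapping-complete method can be
applied to the linearized data.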
COMPRESSION SCHEMES FOR
MULTIDIMENSIONAL ARRAYS
Computing aggregations efficiently on compressed
data warehouses is crucial once large
multidimensional databases are compressed for
storage and efficiency reasons. For data
warehousing applications this compression must be
lossless, so that the original data can be fully
recovered from its compressed form.
In this section we discuss several compression
schemes that are applied to MOLAP. We start by
discussing multidimensional array linearization,
which may be used as part of many compression
schemes. After that we review a set of compression
techniques that includes chunk-offset compression,
compressed row or column storage, extended
Karnaugh map representation, header compression,
BAP compression, run-length encoding, bitmap
compression, and finally history offset
compression.
Multidimensional Array Linearization
Figure 1 is an example of mapping a relational
table to a multidimensional array. In a Traditional
Multidimensional Array (TMA) based implementation
of a MOLAP scheme, the kth column of an n-column
relational table is mapped to the kth dimension of
the multidimensional array. Each column value is
mapped to a unique subscript, and the measure value
(i.e., the sales value) of the relational table is
inserted into the corresponding cell of the
multidimensional array. Therefore, each record of
the relation can be expressed as one cell in the
multidimensional array, if each column of the
relation is assigned to each dimension of the