Database Reference
In-Depth Information
in reducing the cost of data communication in
distributed networks. In order to use or interpret
compressed data, it is necessary to restore the
information to its uncompressed format. To do
this, a decoding algorithm must be available,
and performance concerns are relevant for that
operation. In some applications, data compres-
sion can also lead to other types of improvement
in system performance. For example, in some
index structures it is possible through compres-
sion to pack more keys into each index block.
When the database is searched for a given key
value, the key is first compressed and the search
is performed against the compressed keys in the
index blocks. The net effect is that fewer blocks
have to be retrieved and thus the average search
cost is reduced.
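The search over compressed keys described above can be sketched as follows. The text does not name a compression scheme, so this sketch assumes an order-preserving dictionary encoding, in which each distinct key is replaced by a small integer code; the function names `build_dictionary`, `compress_block`, and `search_block` are hypothetical and serve only as illustration.

```python
from bisect import bisect_left

def build_dictionary(keys):
    """Assign each distinct key an integer code that preserves sort order.

    Assumption: an order-preserving dictionary encoding stands in for
    whatever key compression a real index would use.
    """
    return {key: code for code, key in enumerate(sorted(set(keys)))}

def compress_block(keys, dictionary):
    """Replace the keys of one index block by their sorted integer codes."""
    return sorted(dictionary[k] for k in keys)

def search_block(block, key, dictionary):
    """Compress the search key first, then binary-search the compressed block."""
    code = dictionary.get(key)
    if code is None:  # the key does not occur anywhere in the dictionary
        return False
    i = bisect_left(block, code)
    return i < len(block) and block[i] == code

keys = ["amsterdam", "berlin", "cairo", "delhi"]
dictionary = build_dictionary(keys)
block = compress_block(keys, dictionary)
found = search_block(block, "cairo", dictionary)
```

Because the codes are shorter than the original keys, more of them fit into a block of fixed size, which is the source of the reduced search cost.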
Online Analytical Processing (OLAP) is a
database acceleration technique used for deductive
analysis. The main objective of OLAP is to
have constant-time or near constant-time answers
for many typical queries. There are two types of
OLAP, namely Relational OLAP (ROLAP) and
Multidimensional OLAP (MOLAP).
In ROLAP, the data is usually stored in
the form of “summary tables”. ROLAPs are built
on top of standard relational database systems,
whereas MOLAPs are based on multidimen-
sional database systems. The data structures in
which ROLAPs and MOLAPs store datasets are
fundamentally different. ROLAPs use relational
tables as their basic data structure and MOLAPs
store their datasets as multidimensional arrays.
Large multidimensional arrays also serve as
basic data structures for scientific computations,
business analysis, and visualization, where huge
amounts of data must be manipulated.
Multidimensional rectangular arrays, dense or
sparse depending on the context, form
the fundamental abstract data structure used in
many computation schemes. One area where
multidimensional arrays are commonly used is
data warehousing and OLAP,
which often require extraction of
statistical information for decision support.
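The difference between the two storage models can be illustrated with a small sketch. The fact data, the dimension names, and the helper `total_for_product` are invented for this example; a real ROLAP system would hold the summary table in a relational database rather than in a dictionary.

```python
# Hypothetical fact data: (product, region, sales).
rows = [("pen", "east", 10), ("pen", "west", 4), ("ink", "east", 7)]

# ROLAP style: a precomputed summary table, here total sales per product.
summary_by_product = {}
for product, region, sales in rows:
    summary_by_product[product] = summary_by_product.get(product, 0) + sales

# MOLAP style: a multidimensional array indexed by dimension positions.
products = ["ink", "pen"]
regions = ["east", "west"]
cube = [[0] * len(regions) for _ in products]
for product, region, sales in rows:
    cube[products.index(product)][regions.index(region)] = sales

def total_for_product(product):
    """Aggregate along the region dimension of the array."""
    return sum(cube[products.index(product)])
```

Note that the cell for ("ink", "west") exists in the array although no fact was recorded for it. Such empty cells are what make large multidimensional arrays sparse.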
In MOLAP applications, data compression is
important because database performance strongly
depends on the amount of available memory. A
MOLAP system holds a set of multidimensional datasets and
is designed for the efficient and convenient
storage and retrieval of large volumes of
closely related data that are viewed and analyzed
from different perspectives. The multidimensional
arrays that are linearized to store these
datasets normally have a high degree of sparsity and
need to be compressed. It is therefore desirable
to develop techniques that can access the data
in their compressed form and can perform logi-
cal operations directly on the compressed data.
Multidimensional arrays are well suited to storing dense
data, but most datasets are sparse. Storing them
uncompressed wastes a great deal of memory, because a large
number of array cells are empty, and this makes plain arrays
very hard to use in actual
implementations. In particular, the sparsity problem
grows worse as the number of dimensions
increases. This is because the number of all possible
combinations of dimension values increases
exponentially, whereas the number of actual data
values does not increase at such a rate. Efficient
storage schemes are required to store such sparse
data for multidimensional arrays for MOLAP
implementations. In this chapter, a survey of the
compression schemes for multidimensional data
is presented. Data compression techniques are
important not only for data warehousing
implementations but also for any kind of large database
implementation, such as Scientific and Statistical
Databases (SSDB).
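The linearization and the growth of sparsity discussed above can be made concrete with a minimal sketch. It assumes row-major ordering and a simple offset-value representation of the non-empty cells; neither choice is prescribed by the text, and the names `linearize`, `density`, and `compress` are hypothetical.

```python
def linearize(index, shape):
    """Row-major offset of a multidimensional index in an array of the given shape."""
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        offset = offset * n + i
    return offset

def density(num_values, shape):
    """Fraction of array cells that actually hold a data value."""
    cells = 1
    for n in shape:
        cells *= n
    return num_values / cells

def compress(cells, shape):
    """Keep only the non-empty cells as sorted (offset, value) pairs.

    Assumption: `cells` maps index tuples to values and omits empty cells.
    """
    return sorted((linearize(idx, shape), value) for idx, value in cells.items())

# 1000 values fill a 10 x 10 x 10 array completely, but the same
# 1000 values occupy only a tiny fraction of a six-dimensional array.
dense_3d = density(1000, (10, 10, 10))
sparse_6d = density(1000, (10,) * 6)
pairs = compress({(0, 1): 5, (1, 0): 2}, (2, 2))
```

With a fixed number of data values, each added dimension of size 10 multiplies the number of cells by 10, so the density falls by the same factor.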
Some of the most relevant issues concerning
data compression are the ability to perform
efficient random searches in a compressed
database for a given logical position in the
original database, and the ability to provide
an efficient mapping from arbitrary positions in
the compressed data back to the corresponding
logical position in the original database.
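Both mappings can be sketched for one simple scheme. The example assumes that the linearized array is compressed by keeping only the non-empty cells together with their sorted logical offsets, and that `None` marks an empty cell; the class `CompressedArray` and its method names are hypothetical and do not correspond to any particular scheme surveyed in the chapter.

```python
from bisect import bisect_left

class CompressedArray:
    """Offset-value storage of a sparse linearized array (illustrative only)."""

    def __init__(self, linear):
        # Assumption: None marks an empty cell in the uncompressed array.
        self.offsets = [i for i, v in enumerate(linear) if v is not None]
        self.values = [v for v in linear if v is not None]

    def forward(self, logical):
        """Logical position -> physical position, or None if the cell is empty."""
        i = bisect_left(self.offsets, logical)
        if i < len(self.offsets) and self.offsets[i] == logical:
            return i
        return None

    def backward(self, physical):
        """Physical position -> logical position in the original array."""
        return self.offsets[physical]

    def get(self, logical):
        """Read a cell of the original array without decompressing it."""
        p = self.forward(logical)
        return None if p is None else self.values[p]

arr = CompressedArray([None, 5, None, None, 8, 2])
```

In this sketch the search for a logical position costs a binary search over the stored offsets, while the mapping back from a physical position is a single list lookup.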