some special treatment is done, the array size will overflow soon, even if the k bits are used for both the history and offset values. For a query of the form "column name < value", "column name > value", or "column name between value1 and value2", it is necessary to visit different subarrays of the extendible array, whereas in traditional multidimensional arrays the records are organized linearly in a single stream; hence the query time of traditional arrays outperforms that of the extendible array. When the compression scheme is applied, however, the retrieval time is very similar in both cases (Hasan et al., 2007), while there is a gain in extension capability when history-offset compression is used.
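The extension behavior discussed above can be made concrete with a small sketch. The following is a minimal, hypothetical illustration of history-offset addressing in a two-dimensional extendible array (the class and method names are our own, not from the cited work): each extension along a dimension allocates one contiguous subarray tagged with a global history counter, and a cell (i, j) is located in the subarray whose history value is the larger of its row's and its column's.

```python
class ExtendibleArray2D:
    """Sketch of history-offset addressing for a 2-D extendible array.

    Hypothetical illustration: each extension along a dimension
    allocates one contiguous subarray tagged with a monotonically
    increasing "history" counter.
    """

    def __init__(self):
        self.history = 0
        self.row_hist = []   # history value recorded when each row was added
        self.col_hist = []   # history value recorded when each column was added
        self.subarrays = {}  # history value -> contiguous storage for that extension

    def extend_row(self):
        self.history += 1
        self.row_hist.append(self.history)
        # the new row spans all columns that exist so far
        self.subarrays[self.history] = [None] * len(self.col_hist)

    def extend_col(self):
        self.history += 1
        self.col_hist.append(self.history)
        # the new column spans all rows that exist so far
        self.subarrays[self.history] = [None] * len(self.row_hist)

    def _locate(self, i, j):
        # the cell belongs to whichever extension happened later
        if self.row_hist[i] > self.col_hist[j]:
            return self.subarrays[self.row_hist[i]], j
        return self.subarrays[self.col_hist[j]], i

    def set(self, i, j, value):
        sub, off = self._locate(i, j)
        sub[off] = value

    def get(self, i, j):
        sub, off = self._locate(i, j)
        return sub[off]
```

Because every extension is a separate contiguous subarray, the array can grow along either dimension without relocating existing data, which is the extension-capability gain noted above; the price is that a range query must visit several subarrays rather than one linear stream.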
ADDITIONAL RELATED WORK AND PERFORMANCE ISSUES

In this section we briefly review other related work on MOLAP compression and performance.

The performance of accessing elements of large multidimensional arrays is strongly affected by the order of the dimensions, since the array is stored in a predetermined order of dimensions or axes. A multidimensional array with n dimensions can be accessed, or linearized, in n! different ways (Rotem & Zhao, 1996). Hence, the performance of computing an aggregation over a slice of an array may differ greatly depending on which slice operation is performed (Shimada et al., 2006; Seamons & Marianne, 1994).

To cope with this dimension dependency, array chunking can be employed (Zhao et al., 1997) so that all dimensions receive uniform treatment. An n-dimensional array is divided into small n-dimensional chunks, and each chunk is stored as a contiguous page on secondary storage. Chunking moderates the dimension dependency, but chunk size and shape have a great effect: large chunk sizes may cause unnecessary data to be read for queries with small result sets, whereas small chunk sizes may require more disk accesses to retrieve all the chunks needed to answer a query. Moreover, the chunk shape influences the number of chunks retrieved in answering a query. Zhao et al. (1997) show that the optimal dimension order for chunking multidimensional arrays is d1 ≤ d2 ≤ d3 ≤ ... ≤ dn, where di denotes the size of dimension i.

In order to achieve a good compression rate for data cubes, image compression approaches can also be applied, where simple blocks are compressed using common image compression formats, including JPEG (David et al., 1998). The images are divided into arbitrarily shaped regions through adaptive (Pennec & Mallat, 2000) or non-adaptive algorithms (Emmanuel & David, 1999). It is not clear a priori that more complex shapes or algorithms lead to more efficient storage. Another argument for block-coded data cubes is that many efficient buffering schemes for OLAP range queries themselves rely on block coding (Geffner et al., 1999; Lemire, 2002).

Bitmap indices involve two important aspects, namely the basic bitmap index and bitmap index compression (Stabno & Wrembel, 2007). Besides basic bitmap compression, there are also bitmap compression schemes such as Byte-aligned Bitmap Compression (Antoshenkov & Ziauddin, 1996), Word-Aligned Hybrid (Wu et al., 2004; Stockinger & Wu, 2007), and Approximate Encoding (Apaydin et al., 2006). The first two techniques use the basic RLE compression algorithm (discussed in this chapter), in which a continuous run of bits with the same value (either "0" or "1") is represented as one occurrence of the value together with a count of its repetitions. Byte-aligned Bitmap Compression divides a bitmap into 8-bit words, whereas Word-Aligned Hybrid divides a bitmap into 31-bit words. Both techniques offer the best compression ratio for bitmaps describing rows ordered by the value of an indexed attribute. The
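The effect of chunk shape on the number of chunks retrieved can be illustrated with a short counting sketch (the helper below is hypothetical, not from the cited work): for an axis-aligned range query, the number of chunks touched is the product, over the dimensions, of the number of chunk intervals the query overlaps.

```python
from math import prod  # Python 3.8+

def chunks_touched(lo, hi, chunk_shape):
    """Chunks overlapped by the inclusive range [lo, hi] when the
    array is partitioned into chunks of shape `chunk_shape`
    (hypothetical helper for illustration)."""
    return prod(hi[d] // c - lo[d] // c + 1
                for d, c in enumerate(chunk_shape))

# Two chunk shapes with the same volume (64 cells) against one query:
lo, hi = (0, 0), (15, 3)
aligned = chunks_touched(lo, hi, (16, 4))     # chunk shape matches the query
transposed = chunks_touched(lo, hi, (4, 16))  # same volume, rotated shape
```

With the chunk shape aligned to the query, a single chunk read suffices, while the transposed shape forces four chunk reads for the same answer, which is the shape effect noted above.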
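The n! figure can be seen from a small sketch (the function below is an illustrative helper, not from the cited work): each permutation of the dimensions yields a different linearization, that is, a different mapping from an n-dimensional index to a linear offset.

```python
from itertools import permutations

def linear_offset(index, shape, order):
    """Linear offset of `index` in an array of `shape` stored under
    dimension `order`, outermost dimension first (hypothetical helper
    for illustration)."""
    off = 0
    for d in order:
        off = off * shape[d] + index[d]
    return off

shape = (2, 3, 4)  # three dimensions, so 3! = 6 distinct linearizations
offsets = [linear_offset((1, 0, 2), shape, order)
           for order in permutations(range(3))]
```

The same cell lands at a different offset under most orders, so a slice that is contiguous under one linearization is scattered under another; this is why the cost of slice-wise aggregation depends on the chosen dimension order.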