Emerging Database Systems in Support of Scientific Data - Scientific Data Management


B-tree variants or other indexing techniques designed to efficiently store and

retrieve variable-length data in columns, a requirement for profitable

exploitation of many data compression techniques

conjunctive search, join, and set algebra algorithms exploiting the column-wise

storage structure and working directly on compressed data

lazy decompression of data, that is, decompressing data only as needed

rather than as soon as they are brought into main memory, which is required

if such algorithms are to be used

compressed lists of tuple IDs to represent intermediate and final results in

such algorithms

vectorized operations on data streams and the vectorized dataflow network

architecture paradigm, to reduce call overhead costs and allow efficient

query evaluation by interpretation of algebraic expressions rather than

by compilation to low-level code

specially designed buffering techniques for storing and accessing metadata

and results of simple-transaction-type queries, which are in general not

well suited to column storage schemes
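Several of the techniques listed above can be illustrated together in a small sketch. The Python code below is not taken from any particular system; it assumes run-length encoding as the compression scheme and half-open ranges as the compressed tuple ID list, and all function names (rle_encode, select_eq, intersect, materialize) are hypothetical. It shows an equality selection evaluated directly on compressed columns, a conjunction computed on compressed ID lists, and decompression deferred until a consumer actually asks for tuple IDs.

```python
from typing import Iterator, List, Tuple

Run = Tuple[object, int]      # (value, run length); assumed RLE layout
IdRange = Tuple[int, int]     # half-open range [start, end) of tuple IDs


def rle_encode(column: list) -> List[Run]:
    """Run-length encode a column of values."""
    runs: List[Run] = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def select_eq(runs: List[Run], wanted) -> List[IdRange]:
    """Evaluate `column == wanted` on the runs without decompressing them."""
    result: List[IdRange] = []
    start = 0
    for value, length in runs:
        if value == wanted:
            result.append((start, start + length))
        start += length
    return result


def intersect(a: List[IdRange], b: List[IdRange]) -> List[IdRange]:
    """Conjunction (AND) of two sorted, non-overlapping ID-range lists."""
    out: List[IdRange] = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def materialize(ranges: List[IdRange]) -> Iterator[int]:
    """Lazy decompression: produce tuple IDs only when they are consumed."""
    for lo, hi in ranges:
        yield from range(lo, hi)


# Two columns of the same (made-up) table, stored column-wise.
region = ["EU", "EU", "EU", "US", "US", "EU", "EU"]
status = ["ok", "ok", "bad", "bad", "ok", "ok", "ok"]

# region == "EU" AND status == "ok", computed on compressed data only.
hits = intersect(select_eq(rle_encode(region), "EU"),
                 select_eq(rle_encode(status), "ok"))
print(hits)                     # [(0, 2), (5, 7)]
print(list(materialize(hits)))  # [0, 1, 5, 6]
```

The intermediate result stays in compressed form (two ranges instead of four IDs) until the final call to materialize. A real system would likely operate on blocks of values at a time and support other encodings, which this sketch omits.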

Next, we will discuss several of these approaches, with an emphasis on

those techniques that have been claimed in the literature to be of particular

importance in high-performance systems.

7.2 Architectural Principles of Vertical Databases

The architectural principles discussed in this section were proposed by several

groups who have designed different vertical databases over the years. Bringing

them together in this way does not mean that these principles can be arbitrarily

combined with each other. However, they form a collection of ideas one

should probably be aware of when designing or acquiring such systems.

The literature review presented next shows that most of the advantages

of vertical storage in databases for analytical purposes have been known and

exploited since at least the early 1980s, but recently there has been renewed,

widespread market and research interest in the matter. The earlier lack of

interest now seems to be giving way to what might be construed as a canonical

vertical storage architecture, replacing the previous consensus that the “flat

file with indexes” approach is always preferable.

7.2.1 Transposed Files and the Decomposed Storage Model

A number of early papers deal with issues related to how to group, cluster,

or partition the attributes of a database table. For example, Navathe et al. 13
