Database Reference
In-Depth Information
as possible, given the above-mentioned constraints. The algorithm solved this
problem in time O(n).
For reading and writing data in b-lists, six access procedures are avail-
able, allowing sequential as well as direct-addressed read and write access of
data segments. One of the sequential read procedures does not unpack run-
compressed data sequences, enabling fast access to such data. This facility is
used by the conjunctive query search algorithm as well as by the merge-join
algorithm.
7.2.5 Buffering Techniques for Accessing Metadata
In vertical databases, metadata usually play an important role, not only to
support users with information about database contents, but also internally to
support parameter and algorithm selection during runtime. In order to exploit
the advantages of vertical segmentation to achieve the best possible runtime
performance, algorithms for searching and accessing data need to frequently
consult the metadatabase (MDB) for information about the current status of
database contents. For example, to find the fastest way to search a run-length
compressed, fully transposed, ordered file (CFTOF search, see Section 7.2.6),
it is important to be able to quickly access at runtime certain properties of
the attributes involved in the query, in particular their sort key position, as
well as their cardinality and value range. The latter information is needed to
obtain good estimates of the selectivity of search clauses, to be used when
determining which attribute access sequence the query should choose.
However, the vertical storage structure is not well suited for the manage-
ment of metadata, since the access pattern of system-generated, runtime meta-
data lookup queries and updates is much more “horizontally” clustered than
typical user queries. In fact, the access pattern of such queries is quite similar
to that of a random sequence of simple-transaction queries being issued during
a short time interval. To a first approximation the individual items in such a
sequence can be assumed to be uncorrelated, that is, when the system brings
into memory a vertically structured MDB buffer load to access one data item
in an MDB attribute, the likelihood that it will access another data item from
the same buffer load in the near future is no greater than that of a random
hit into the attribute. Therefore, the next time a data item is needed from
the same MDB attribute, it is quite likely that a new buffer load will have to
be fetched from disk. On the other hand, the likelihood is quite high that the
next few MDB accesses will involve other properties associated with the same
MDB relation, which should favor horizontal clustering. In the first case, the
hit rate for the buffered data will be low and the performance of the system
will suffer (unless the entire column fits into the buffer and can therefore be
assumed to stay in memory permanently).
A simple solution to this “impedance mismatch” problem could be to
store the metadata as a collection of hash tables or linked lists; but if one
wants the system to be able to answer general queries that involve metadata
Search WWH ::




Custom Search