of execution and transaction processing enforcement can indeed be achieved
in database architecture.
7.4.4 Further Improvements
The original design of MonetDB had two main weaknesses. First, the reliance
on virtual memory for disk storage means that the buffer manager is removed
from the system architecture. While removing this layer makes it easier to
write efficient data processing algorithms, it means that MonetDB relies on
virtual memory advice calls to implement buffering policies. The downside is
mainly practical, that is, the implementation of such virtual memory advice
can often be incomplete or ineffective, depending on the OS and its version.
Furthermore, virtual memory prefetching is configured at the OS kernel level and
tuned for different access patterns than those that MonetDB targets. This
often leads to I/O prefetch sizes that are too small (and thus, lower
bandwidth is achieved). The second main problem in the design is that the BAT
Algebra implementation follows a design of full materialization. An algebra
operator fully consumes its input BATs, producing a full-result BAT. Again,
while such loop code is simple and efficient, problems may occur if the result
arrays are large. If these are huge, which is often the case with queries on
scientific data, output flows via virtual memory to disk, and swapping may
happen, deteriorating performance. Both of these problems have been fixed in
the subsequent MonetDB/X100 system, which introduces a pipelined model
operating on small BAT pieces (vectors) and a buffer manager that can
perform efficient asynchronous I/O. The use of a buffer manager also means that
compression techniques, which work well with vertical storage, can be
exploited. Furthermore, vertically oriented compressed indexes, such as FastBit
(described in Chapter 6) can be exploited as well.
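
The contrast between full materialization and vector-at-a-time pipelining can be illustrated with a minimal Python sketch. This is not MonetDB or MonetDB/X100 code: the function names and the vector size of 4 are assumptions chosen purely for illustration. Both variants compute the same result, but the pipelined one only ever holds one small vector per operator.

```python
from typing import Iterable, Iterator, List

VECTOR_SIZE = 4  # hypothetical value; a real system would size vectors to fit the CPU cache


def select_gt_full(column: List[int], bound: int) -> List[int]:
    """Full materialization: consume the whole input, produce the whole result."""
    return [v for v in column if v > bound]


def add_const_full(column: List[int], delta: int) -> List[int]:
    """Full materialization: the output is again a complete array."""
    return [v + delta for v in column]


def scan_vectors(column: List[int], size: int = VECTOR_SIZE) -> Iterator[List[int]]:
    """Yield the column as a stream of small slices (vectors)."""
    for start in range(0, len(column), size):
        yield column[start:start + size]


def select_gt_vec(vectors: Iterable[List[int]], bound: int) -> Iterator[List[int]]:
    """Pipelined selection: filter one vector at a time."""
    for vec in vectors:
        out = [v for v in vec if v > bound]
        if out:
            yield out


def add_const_vec(vectors: Iterable[List[int]], delta: int) -> Iterator[List[int]]:
    """Pipelined map: transform one vector at a time."""
    for vec in vectors:
        yield [v + delta for v in vec]


def run_full(column: List[int], bound: int, delta: int) -> int:
    selected = select_gt_full(column, bound)  # intermediate as large as the selection
    return sum(add_const_full(selected, delta))


def run_pipelined(column: List[int], bound: int, delta: int) -> int:
    pipeline = add_const_vec(select_gt_vec(scan_vectors(column), bound), delta)
    return sum(sum(vec) for vec in pipeline)  # intermediates never exceed one vector
```

In the fully materialized variant, `selected` grows with the input, which is the situation the text describes as spilling to disk through virtual memory for huge results. In the pipelined variant, intermediate data is bounded by `VECTOR_SIZE` regardless of the column length.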
7.4.5 Assessment of the Benefits of Vertical Organization
The vertical organization of storage in MonetDB led to the achievement of
the original goal of high performance and CPU efficiency and was shown to
outpace relational competitors on many query-intensive workloads, especially
when data fits into RAM (see case study in the next subsection). Because of
the vertical data layout, it was possible to develop a series of
architecture-conscious query processing algorithms, such as radix-partitioned
hash joins and radix cluster/decluster (cache-efficient permutation). Also,
pioneering work in architecture-conscious cost modeling and automatic cost
calibration was done in this context.
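
The idea behind a radix-partitioned hash join can be sketched as follows. This is a simplified, single-pass illustration in Python, not the MonetDB implementation: it partitions directly on the low bits of integer keys (an assumption; a real implementation would typically use bits of a hash value and may partition in several passes), and the names `radix_partition`, `radix_hash_join`, and `RADIX_BITS` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

RADIX_BITS = 2  # hypothetical: 2 bits gives 4 partitions


def radix_partition(keys: List[int], bits: int = RADIX_BITS) -> List[List[Tuple[int, int]]]:
    """Split (position, key) pairs into 2**bits partitions on the low key bits."""
    mask = (1 << bits) - 1
    parts: List[List[Tuple[int, int]]] = [[] for _ in range(1 << bits)]
    for pos, key in enumerate(keys):
        parts[key & mask].append((pos, key))
    return parts


def radix_hash_join(left: List[int], right: List[int],
                    bits: int = RADIX_BITS) -> List[Tuple[int, int]]:
    """Return (left_pos, right_pos) pairs whose keys are equal."""
    result: List[Tuple[int, int]] = []
    # Equal keys share the same low bits, so only matching partitions are joined.
    for lpart, rpart in zip(radix_partition(left, bits), radix_partition(right, bits)):
        table: Dict[int, List[int]] = defaultdict(list)
        for lpos, key in lpart:          # build a small per-partition hash table
            table[key].append(lpos)
        for rpos, key in rpart:          # probe it with the matching partition
            for lpos in table.get(key, ()):
                result.append((lpos, rpos))
    return result
```

The point of the partitioning step is that each per-partition hash table is small; with enough partitions it can fit in the CPU cache, so the build and probe phases avoid the random memory accesses of one large hash table.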
The approach taken by MonetDB of using a front-end/back-end architecture
provides practical advantages as well. The system is relatively easy to extend with
new modules that introduce new BAT Algebra operators. This ease can be
attributed to the direct array interface to data in MonetDB, which basically
implies that no API is needed to access data (therefore, database extenders
do not have to familiarize themselves with a complex API).
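
As a rough illustration of why a direct array interface lowers the barrier for extenders, the sketch below treats a column as a plain typed array and writes a new operator as an ordinary loop over it. This is illustrative Python, not MonetDB's actual extension mechanism; the operator name `clip_column` is hypothetical.

```python
from array import array


def clip_column(values: array, low: int, high: int) -> array:
    """Hypothetical new operator: clamp each value into [low, high]."""
    out = array(values.typecode, values)  # copy with the same element type
    for i in range(len(out)):
        if out[i] < low:
            out[i] = low
        elif out[i] > high:
            out[i] = high
    return out
```

The operator touches the data only through indexing, so no accessor layer or cursor abstraction has to be learned before writing it.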