70s could be orders of magnitude higher than those of processors designed
for commercial workloads. The main purpose was to make query evaluation
compute and memory bound rather than I/O bound whenever possible.
The MonetDB developers [17, 26, 31] have conducted thorough analyses of the
effect of modern computer hardware architectures on database performance.
As advances in CPU speed far outpace advances in dynamic random access
memory (DRAM) latency, the effect of optimal use of the memory caches is
becoming ever more important. In Manegold et al. 17 a detailed discussion is
presented of the impact of modern computer architectures, in particular with
respect to their use of multilevel cache memories to alleviate the continually
widening gap between DRAM and CPU speeds that has been a characteris-
tic for computer hardware evolution since the late 70s. Memory access speed
has stayed almost constant (within a factor of 2), while CPU speed has in-
creased by almost a factor of 1,000 from 1979 to 1999. Cache memories, which
have been introduced on several levels to reduce memory latency, can do so
effectively only when the requested data are found in the cache.
Manegold et al. [17] argue that it is no longer appropriate to think of a
computer system's main memory as "random access" memory, and show that
sequential access, even within main memory, can provide significant
performance advantages. They furthermore show that, unless special care is
taken, a database server running even a simple sequential scan on a table may
spend 95% of its cycles waiting for memory to be accessed. This memory-
access bottleneck is even more difficult to avoid in more complex database
operations such as sorting, aggregation, and join, which exhibit a random
access pattern. The performance advantages of exploiting sequential data ac-
cess patterns during query processing have thus become progressively more
significant as faster processor hardware has become available.
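The effect of access order can be sketched with a small experiment. The following is an illustrative Python sketch, not MonetDB code; in interpreted code the interpreter's per-operation cost dampens the effect, which is far more dramatic in a compiled language. Summing the same array once in sequential order and once in a shuffled order does identical work, but the random order defeats hardware prefetching and cache-line reuse.

```python
import random
import time

N = 1_000_000
data = list(range(N))

seq_order = list(range(N))
rand_order = list(range(N))
random.shuffle(rand_order)

def scan(order):
    """Sum data[] in the given visiting order, timing the pass."""
    t0 = time.perf_counter()
    total = 0
    for i in order:
        total += data[i]
    return total, time.perf_counter() - t0

seq_total, seq_t = scan(seq_order)
rand_total, rand_t = scan(rand_order)
assert seq_total == rand_total  # identical work, different access pattern
```

Both passes execute the same number of additions; any difference between `seq_t` and `rand_t` is attributable purely to the memory access pattern.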
Based on results from a detailed analytical cost model, Manegold et al. [17]
discuss the consequences of this bottleneck for data structures and algorithms
to be used in database systems and identify vertical fragmentation as the
storage layout that leads to optimal memory cache usage.
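The intuition behind vertical fragmentation is that storing each attribute contiguously lets a scan read only the bytes it needs, instead of dragging entire records through the cache. A minimal sketch follows; the table and column names are illustrative, not MonetDB's actual storage format.

```python
n = 10_000

# Row store ("horizontal" layout): each record is stored contiguously,
# so scanning one attribute still pulls whole tuples into the cache.
rows = [(i, i * 2.0, "name%d" % i) for i in range(n)]

# Vertically fragmented ("column") store: each attribute is contiguous,
# so a scan touches only the attribute it actually needs.
columns = {
    "id": list(range(n)),
    "price": [i * 2.0 for i in range(n)],
    "name": ["name%d" % i for i in range(n)],
}

row_sum = sum(r[1] for r in rows)   # reads past id and name in every tuple
col_sum = sum(columns["price"])     # reads only the price column
assert row_sum == col_sum
```

With the column layout, a scan over one attribute moves through memory strictly sequentially, which is exactly the access pattern the cache hierarchy rewards.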
A technique the MonetDB developers pioneered in database performance
research is the construction of detailed access cost models from the
hardware event counters available in modern CPUs. Such models have enabled
them, among other things, to identify a significant bottleneck in the
implementation of the partitioned hash-join and hence to improve
it using perfect hashing. Another contribution is their creation of a
calibration tool, which allows relevant performance characteristics (cache sizes, cache
line sizes, cache miss latencies) of the cache memory system to be extracted
from the operating system for use in cost models, in order to predict the per-
formance of, and to automatically tune, memory-conscious query processing
algorithms on any standard processor.
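The core idea of such a calibrator can be sketched as follows. This is a simplified Python illustration, not MonetDB's actual calibration tool (which is written in C); Python's interpreter overhead largely masks the cache effects the real tool measures, but the structure of the measurement is the same: chase pointers through a random cyclic permutation, so each load depends on the previous one and prefetching cannot help, and watch the cost per hop as the working set grows past each cache level.

```python
import random
import time

def ns_per_hop(n_slots, hops=200_000):
    """Pointer-chase a random cyclic permutation of n_slots array slots.

    Every hop depends on the previous one, so hardware prefetching
    cannot hide the latency. In compiled code, the ns/hop curve jumps
    each time the working set outgrows a cache level, revealing cache
    sizes and miss latencies.
    """
    order = list(range(n_slots))
    random.shuffle(order)
    # Link the shuffled slots into a single cycle.
    nxt = [0] * n_slots
    for i in range(n_slots):
        nxt[order[i]] = order[(i + 1) % n_slots]
    pos = 0
    t0 = time.perf_counter()
    for _ in range(hops):
        pos = nxt[pos]
    return (time.perf_counter() - t0) / hops * 1e9

# Sweep working-set sizes; knees in the curve hint at cache boundaries.
timings = {slots: ns_per_hop(slots) for slots in (1 << 10, 1 << 14, 1 << 18)}
```

In a C implementation, the measured knees yield the cache sizes and miss latencies that the cost models consume, which is what makes the models portable across processors.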
In the MonetDB developers' experience, virtual-memory advice on modern
operating systems can be utilized effectively enough to make a single-level
storage software architecture feasible. Thus, the