Emerging Database Systems in Support of Scientific Data - Scientific Data Management

bottleneck so there is very limited capacity for a shared-memory system to

scale.

In a shared-disk architecture, there are a number of independent processor

nodes, each with its own memory. Such architectures also have a number of

drawbacks that limit scalability. The interconnection network that connects

each processor to the shared-disk subsystem can become a bottleneck. Since

there is no pool of memory that is shared by the processors, there is no

obvious place for the lock table or buffer pool to reside. To set locks, one must

either centralize the lock manager on one processor or introduce a distributed

locking protocol. Both are likely to become bottlenecks as the system is scaled

up.

In a shared-nothing approach, each processor has its own set of disks. Every

node maintains its own lock table and buffer pool, eliminating the need

for complicated locking and consistency mechanisms. Data are “horizontally

partitioned” across nodes, such that each node has a subset of the rows (and

in vertical databases, possibly also a subset of the columns) from each big table

in the database. According to these authors, shared-nothing is generally

regarded as the best-scaling architecture (see also DeWitt and Gray 47 ).
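
As a minimal sketch of horizontal partitioning in a shared-nothing setting, the following Python fragment assigns each row to exactly one node and answers a query by combining per-node results. The hash-based placement rule and the names node_for, partition_rows, and local_count are assumptions made for illustration; the text does not prescribe a particular partitioning function.

```python
import zlib

def node_for(key, num_nodes):
    """Map a partitioning key to a node id with a stable hash (assumed placement rule)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_nodes

def partition_rows(rows, key_column, num_nodes):
    """Horizontally partition rows so that each row lives on exactly one node."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        partitions[node_for(row[key_column], num_nodes)].append(row)
    return partitions

def local_count(partition, predicate):
    """Work done by one node, touching only its own rows."""
    return sum(1 for row in partition if predicate(row))

# Hypothetical table of 20 rows spread over 4 nodes.
rows = [{"id": i, "temp": 10 + i % 7} for i in range(20)]
partitions = partition_rows(rows, "id", 4)

# Each node counts its qualifying rows; only the small partial counts are combined.
is_warm = lambda r: r["temp"] > 12
total = sum(local_count(p, is_warm) for p in partitions)
```

Because every row is owned by a single node, each node can evaluate the predicate against its own partition without consulting a shared lock table or buffer pool.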

7.3 Two Contemporary Vertical Database Systems: MonetDB and C-Store

We give next a brief overview of two recently developed vertical database

systems to contrast their styles.

7.3.1 MonetDB

MonetDB 48 , 49 uses the DSM storage model. A commonly perceived drawback

of the DSM is that queries must spend “tremendous additional time” doing

extra joins to recombine fragmented data. This was, for example, explicitly

claimed by Ailamaki et al. 29 (p. 169). According to Boncz and Kersten, 30 for

this reason the DSM was for a long time not taken seriously by the database

research community. However, as these authors observe (and as was known

and exploited long ago 16 , 27 ), vertical fragments of the same table contain different

attribute values from identical tuple sequences; and if the join operator is

aware of this, it does not need to spend significant effort on finding matching

tuples. MonetDB maintains fragmentation information as properties (metadata)

on each binary association table and propagates these across operations.

The choice of algorithms is typically deferred until runtime and is done on the

basis of such properties.
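
The idea of property-driven algorithm selection can be sketched as follows. This is an illustrative Python model, not MonetDB's actual code or API: the Bat class, the dense and seqbase properties, and the three join strategies are assumptions chosen to show how metadata carried on each fragment can let a join skip the search for matching tuples.

```python
from dataclasses import dataclass
from typing import Any, List

@dataclass
class Bat:
    """Hypothetical binary association table: one column as (oid, value) pairs."""
    oids: List[int]
    values: List[Any]
    dense: bool    # property: oids are consecutive, starting at seqbase
    seqbase: int   # property: first oid (meaningful only when dense)

def make_bat(values, seqbase=0):
    """Build a fragment whose oids follow the table's tuple sequence."""
    n = len(values)
    return Bat(list(range(seqbase, seqbase + n)), list(values), True, seqbase)

def select(bat, predicate):
    """Selection may drop tuples, so the result is conservatively marked non-dense."""
    kept = [(o, v) for o, v in zip(bat.oids, bat.values) if predicate(v)]
    return Bat([o for o, _ in kept], [v for _, v in kept], False, -1)

def join(left, right):
    """Recombine two fragments on oid; the algorithm is chosen at run time."""
    if (left.dense and right.dense and left.seqbase == right.seqbase
            and len(left.values) == len(right.values)):
        # Identical tuple sequences: matching tuples sit at the same position.
        return "positional", list(zip(left.oids, left.values, right.values))
    if right.dense:
        # Right side is dense: each oid is turned into an array offset.
        out = []
        for o, v in zip(left.oids, left.values):
            pos = o - right.seqbase
            if 0 <= pos < len(right.values):
                out.append((o, v, right.values[pos]))
        return "fetch", out
    # No usable property: fall back to a general hash join.
    index = dict(zip(right.oids, right.values))
    return "hash", [(o, v, index[o])
                    for o, v in zip(left.oids, left.values) if o in index]

names = make_bat(["a", "b", "c", "d"])
temps = make_bat([5, 20, 7, 30])
algo_full, full = join(names, temps)     # both fragments aligned
hot = select(temps, lambda t: t > 10)    # result loses the dense property
algo_hot, hot_names = join(hot, names)   # right side still dense
```

In this sketch the join of two untouched fragments costs a single pass with no lookups, which is the case the text describes. Only when an earlier operation has destroyed the alignment does the join fall back to more expensive strategies.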