Database Reference
In-Depth Information
the success of scientific computation programs in extracting eciency from
modern CPUs by expressing its calculations typically in tight loops over ar-
rays, which are well supported by compiler technology to extract maximum
performance from CPUs through techniques such as strength reduction (re-
placing an operation with an equivalent, less costly operation), array blocking
(grouping subsets of an array to increase cache locality), and loop pipelining
(mapping loops into optimized pipeline executions).
7.4.2 The Binary Association Table Algebra
The distinctive feature of MonetDB thus is the so-called Binary Association
Table (BAT) Algebra, which offers operations that work only on a handful of
BATs. The term binary association table refers to a two-column
<
surrogate,
value
table as proposed in DSM. The left column (often the surrogate of the
record identity) is called the head column, and the right column, the tail . The
BAT Algebra is closed on BATs, that is, its operators get BATs (or constants)
as parameters, and produce a BAT (or constant) result. Data in execution
is always stored in (intermediate) BATs, and even the result of a query is
a collection of BATs. Some database systems using vertical fragmentation
typically use a relational algebra that adopts the table data model, that is,
horizontal records inside the execution engine. In the implementation, this
leads to query processing strategies where relational tuples (i.e., horizontal
structures) are reconstructed early in the plan, typically in the Scan operator.
This is not the case in MonetDB; data always remain vertically fragmented.
BAT storage takes the form of two simple memory arrays, one for the head
column and one for the tail column (variable-width types are split into two
arrays: one with offsets, and the other with all concatenated data). MonetDB
allows direct access to these entire arrays by the BAT Algebra operators.
In case the relations are large, it uses memory-mapped files to store these
arrays. This was in line with the philosophy of exploiting hardware features
as much as possible. In this case, allowing array lookup as a way to locate
tuples in an entire table in effect means that MonetDB exploits the MMU
(memory management unit) hardware in a CPU to offer a very-fast O(1)
lookup mechanism by position—where the common case is that the DSM
surrogate columns correspond to positions.
As shown in Figure 7.2, MonetDB follows a front-end/back-end architec-
ture, where the front end is responsible for maintaining the illusion of data
stored in some end-user format (i.e., relational tables or objects, or XML trees
or RDF graphs in SQL, ODMG, XQuery, and SPARQL front ends, respec-
tively). In the MonetDB back end, there is no concept of relational tables (nor
concepts of objects); there are only BATs. The front ends translate end-user
queries in SQL, OQL, XQuery, and SPARQL into BAT Algebra, execute the
plan, and use the resulting BATs to present results. A core fragment of the
language is presented below:
>
Search WWH ::




Custom Search