appear in the join expression. The result of this phase is another collection
of temporary index lists indicating which tuples in each conceptual relation
satisfy the query. Since a join index clustered on the desired TID exists for all
entity-based equi-joins, a full scan can always be avoided. During the value
materialization phase several independent joins are evaluated, preferably in
parallel. The join operands are small binary relations containing only TIDs.
The final composition phase executes an m-way merge join, which permits a
large degree of parallelism. Its operands are all small binary relations containing
only TID lists whose cardinality has been maximally reduced due to the
select operations.
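The composition phase can be illustrated with a minimal sketch. It assumes a simplification of the algorithm described above: each operand is reduced to a single ascending, duplicate-free list of TIDs that survived the selection phase, and the m-way merge join keeps only the TIDs present in every list. The function name m_way_merge_join and the sample data are hypothetical and are not taken from the cited papers.

```python
from typing import List, Sequence


def m_way_merge_join(tid_lists: Sequence[Sequence[int]]) -> List[int]:
    """Return the TIDs that occur in every input list.

    Assumption: each input list is sorted ascending and has no duplicates,
    as would be the case for TID lists read from an index clustered on TID.
    """
    if not tid_lists:
        return []
    cursors = [0] * len(tid_lists)  # one read position per operand
    result: List[int] = []
    # Stop as soon as any operand is exhausted: no further match is possible.
    while all(c < len(lst) for c, lst in zip(cursors, tid_lists)):
        heads = [lst[c] for c, lst in zip(cursors, tid_lists)]
        target = max(heads)
        if all(h == target for h in heads):
            # Every operand agrees on this TID, so it satisfies the query.
            result.append(target)
            cursors = [c + 1 for c in cursors]
        else:
            # Advance every operand whose current TID is below the largest one.
            cursors = [c + 1 if h < target else c
                       for c, h in zip(cursors, heads)]
    return result


if __name__ == "__main__":
    print(m_way_merge_join([[1, 3, 5, 7], [3, 4, 5, 8], [0, 3, 5, 9]]))
```

Each operand is read once, front to back, which is why a merge of this kind suits inputs that are already clustered on TID. The sketch is sequential and does not attempt to show the parallelism mentioned in the text.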
The practical conclusions from this work, reported in Valduriez et al. 25 and
cited in Khoshafian et al., 23 are (1) that DSM with join indexes provides better
retrieval performance than NSM when the number of retrieved attributes is
low or the number of retrieved records is medium to high, but NSM provides
better retrieval performance when the number of retrieved attributes is high
and the number of retrieved records is low; and (2) that the performance of
single attribute modification is the same for both DSM and NSM, but NSM
provides better record insert/delete performance.
This approach is similar to those used in MonetDB 17, 26 and in Cantor, 27
with the following main differences: (1) DSM provides two predefined join indexes
for each attribute, one clustered on each of the two attributes (attribute
value, TID), while Cantor and MonetDB both use indexes that are created
as needed during query evaluation; (2) Cantor stores these indexes using RLE
compression, while MonetDB introduces a novel radix cluster algorithm for hash
join; (3) although potentially important, parallelism has not been presented
as a key design issue for MonetDB, nor was it one for Cantor; (4) the algorithms
used in MonetDB and Cantor were both presented as simple two-way
joins, corresponding mainly to the composition phase in the DSM algorithm,
which is presented as an m-way join.
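To make the reference to RLE compression concrete, the following sketch shows generic run-length encoding applied to the value column of an index clustered on attribute value, where equal values are adjacent. It is not a description of Cantor's actual storage format, which is not given here; the function names rle_encode and rle_decode and the sample column are hypothetical.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def rle_encode(values: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse each run of equal adjacent values into (value, run_length)."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(runs: Iterable[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into the original column."""
    column: List[int] = []
    for value, length in runs:
        column.extend([value] * length)
    return column


if __name__ == "__main__":
    # Value column of an index sorted (clustered) on attribute value.
    column = [7, 7, 7, 9, 9, 12]
    runs = rle_encode(column)
    print(runs)
    print(rle_decode(runs) == column)
```

The encoding pays off only when the column contains long runs, which is why it pairs naturally with an index clustered on the attribute value rather than on the TID.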
7.2.2 The Impact of Modern Processor Architectures
Research has shown that DBMS performance may be strongly affected by
“cache misses” 28 and can be much improved by use of cache-conscious data
structures, including column-wise storage layouts such as DSM and within-page
vertical partitioning techniques. 29 In Ailamaki et al. 28 (p. 266) this observation
is summarized as follows: “Due to the sophisticated techniques used
for hiding I/O latency and the complexity of modern database applications,
DBMSs are becoming compute and memory bound.” In Boncz et al., 30 it is
noted that past research on main-memory databases has shown that main-
memory execution needs different optimization criteria than those used in
I/O-dominated systems.
On the other hand, it was a common goal early on for scientific database
management systems (SDBMS) development projects to exploit the superior
CPU power of computers used for scientific applications, which in the early