replicated tables. For example, if all the columns in a query are indexed, the
query can be substantially sped up by scanning the shorter index records
instead of touching the wide records of the main table. We illustrate this with
the next query example. It extracts celestial objects that are low-z quasar
candidates, a property specified through correlations between the objects'
magnitudes in different color bands (query SX11 in Gray et al. [58]).
SELECT g, run, rerun, camcol, field, objID
FROM Galaxy
WHERE ( (g <= 22)
  and (u - g >= -0.27) and (u - g < 0.71)
  and (g - r >= -0.24) and (g - r < 0.35)
  and (r - i >= -0.27) and (r - i < 0.57)
  and (i - z >= -0.35) and (i - z < 0.70) )
The query predicates do not allow efficient index search of the qualifying
rows; instead scanning of all the rows is needed. However, a full table scan
can be avoided using available indexes that contain all the necessary columns.
The data volume transferred for the 150 GB dataset is 1.8 GB, a substantial
reduction with respect to the full table scan, but still twice as large as the
850 MB transferred in MonetDB for the same query. The reason is that the
indexes chosen for the query execution contain several additional columns
irrelevant for this query.
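The index-only access path described above can be tried out on a toy scale. The following is a minimal sketch that uses SQLite through Python's sqlite3 module as a stand-in engine; it is not one of the systems measured in the text. The padding columns, the index name idx_colors, and the three sample rows are hypothetical and exist only to make the table wider than the index. The sketch asks the engine for its query plan before and after an index containing all ten referenced columns is created.

```python
import sqlite3

MAGNITUDES = ["u", "g", "r", "i", "z"]
IDS = ["run", "rerun", "camcol", "field", "objID"]
# Hypothetical padding columns standing in for the many other attributes
# of the wide Galaxy table.
PADDING = [f"pad{n:02d}" for n in range(12)]
NEEDED = MAGNITUDES + IDS

QUERY = """
SELECT g, run, rerun, camcol, field, objID
FROM Galaxy
WHERE ( (g <= 22)
  AND (u - g >= -0.27) AND (u - g < 0.71)
  AND (g - r >= -0.24) AND (g - r < 0.35)
  AND (r - i >= -0.27) AND (r - i < 0.57)
  AND (i - z >= -0.35) AND (i - z < 0.70) )
"""

# Hypothetical index that contains every column the query references.
CREATE_INDEX = f"CREATE INDEX idx_colors ON Galaxy ({', '.join(NEEDED)})"


def make_galaxy_table():
    """Create an in-memory Galaxy table with a few invented sample rows."""
    conn = sqlite3.connect(":memory:")
    columns = (
        [f"{c} REAL" for c in MAGNITUDES]
        + [f"{c} INTEGER" for c in IDS]
        + [f"{c} REAL" for c in PADDING]
    )
    conn.execute(f"CREATE TABLE Galaxy ({', '.join(columns)})")
    rows = [
        # u,    g,    r,    i,    z,  run, rerun, camcol, field, objID
        (19.5, 19.3, 19.2, 19.1, 19.0, 1, 0, 1, 10, 1001),  # passes all cuts
        (23.5, 23.3, 23.2, 23.1, 23.0, 1, 0, 2, 11, 1002),  # fails g <= 22
        (21.0, 19.0, 18.9, 18.8, 18.7, 2, 0, 3, 12, 1003),  # fails u - g < 0.71
    ]
    placeholders = ", ".join("?" for _ in NEEDED)
    conn.executemany(
        f"INSERT INTO Galaxy ({', '.join(NEEDED)}) VALUES ({placeholders})",
        rows,
    )
    return conn


def query_plan(conn, sql):
    """Return the human-readable plan steps reported by SQLite."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


if __name__ == "__main__":
    conn = make_galaxy_table()
    print("without index:", query_plan(conn, QUERY))
    conn.execute(CREATE_INDEX)
    print("with index:   ", query_plan(conn, QUERY))
    print("result rows:  ", conn.execute(QUERY).fetchall())
```

The exact plan wording differs between SQLite versions, so no output is quoted here. The point to look for is that the second plan reads the index rather than the base table, while every index row is still visited because the color-difference predicates cannot narrow the search.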
7.5.3 Improved Performance
In addition to the efficient vertical access pattern, MonetDB employs a number
of techniques to provide high performance for analytical applications. Among
these are runtime optimization, such as choosing the best algorithm fitting
the argument properties, and efficient cache-conscious algorithms exploiting
modern computer architecture. To demonstrate the net effect of these
techniques on the performance experienced by the end user, we performed a few
experiments with the above table- and index-scan queries against both the
1.5 GB and 150 GB datasets. The elapsed times in seconds are shown in
Table 7.1. The performance of the vertical database (the column-store) for
index-supported queries is comparable for the small dataset (0.47 s versus
0.4 s), and about 30% better for the large dataset (16 s versus 24 s). Queries
involving full table scans are sped up by a factor of almost 5 for the large
dataset (53 s versus 245 s).
TABLE 7.1 Elapsed times in seconds for two types of queries against
a “small” (1.5 GB) and a “large” (150 GB) dataset
                Table Scan   Index Scan   Table Scan   Index Scan
                1.5 GB       1.5 GB       150 GB       150 GB
Row-store       6.6          0.4          245          24
Column-store    0.4          0.47         53           16
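The ratios quoted in the text can be recomputed directly from Table 7.1. The short sketch below only restates the table's numbers; the dictionary layout and the helper names speedup and time_saved are illustrative choices, not part of the original experiments.

```python
# Elapsed times in seconds, copied from Table 7.1.
ELAPSED = {
    "row-store": {
        ("table scan", "1.5 GB"): 6.6,
        ("index scan", "1.5 GB"): 0.4,
        ("table scan", "150 GB"): 245,
        ("index scan", "150 GB"): 24,
    },
    "column-store": {
        ("table scan", "1.5 GB"): 0.4,
        ("index scan", "1.5 GB"): 0.47,
        ("table scan", "150 GB"): 53,
        ("index scan", "150 GB"): 16,
    },
}


def speedup(kind, size):
    """Row-store time divided by column-store time (above 1: column-store wins)."""
    key = (kind, size)
    return ELAPSED["row-store"][key] / ELAPSED["column-store"][key]


def time_saved(kind, size):
    """Fraction of the row-store elapsed time that the column-store saves."""
    return 1 - 1 / speedup(kind, size)


if __name__ == "__main__":
    for key in ELAPSED["row-store"]:
        kind, size = key
        print(f"{kind:10s} {size:6s} speedup {speedup(kind, size):5.2f}x, "
              f"time saved {time_saved(kind, size):6.1%}")
```

For the 150 GB dataset this gives a table-scan speedup of 245/53, about 4.6, and an index-scan saving of 1 - 16/24, about one third of the elapsed time, consistent with the rounded figures in the text.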