Database Reference
In-Depth Information
Figure 7.1
Horizontal vs. vertical organization of tabular (relational) data.
(a) Logical table with an Object ID column (0, 1, 2, ...) and property columns
P1, P2, P3, ..., P200, where V i,j is the value of property j for object i.
(b) Row-wise layout: V 0,1, V 0,2, V 0,3, ..., then V 1,1, V 1,2, V 1,3, ...
(c) Column-wise layout: V 0,1, V 1,1, V 2,1, ..., then V 0,2, V 1,2, V 2,2, ...
shown in Figure 7.1(a). A horizontal organization of the table simply means
that the physical layout of the data is row-wise, each row following its
predecessor, as shown in Figure 7.1(b). Usually the entire table is stored in
disk pages or files, each containing multiple rows. A vertical organization
means that the layout of the data is column-wise, as shown in Figure 7.1(c).
Note that an entire column containing a billion values is usually stored in
multiple disk pages or multiple files.
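The two layouts can be illustrated with a minimal sketch. The function names and the tiny 3-by-3 table are assumptions made for illustration only; they are not taken from any particular database system. Values are labeled V i,j as in Figure 7.1.

```python
# Illustrative sketch only: a toy table flattened in row-wise and column-wise order.
# All names here (value, row_wise, column_wise, ...) are hypothetical.

def value(obj_id, prop):
    """Label for the value of property `prop` (1-based) of object `obj_id` (0-based)."""
    return f"V{obj_id},{prop}"

def row_wise(num_objects, num_props):
    """Horizontal organization: all properties of object 0, then object 1, ..."""
    return [value(i, j) for i in range(num_objects) for j in range(1, num_props + 1)]

def column_wise(num_objects, num_props):
    """Vertical organization: property 1 of every object, then property 2, ..."""
    return [value(i, j) for j in range(1, num_props + 1) for i in range(num_objects)]

def row_offset(obj_id, prop, num_props):
    """Position of a value inside the row-wise sequence."""
    return obj_id * num_props + (prop - 1)

def column_offset(obj_id, prop, num_objects):
    """Position of a value inside the column-wise sequence."""
    return (prop - 1) * num_objects + obj_id

if __name__ == "__main__":
    print(row_wise(3, 3))
    print(column_wise(3, 3))
```

In the row-wise sequence the values of one object are contiguous, while in the column-wise sequence the values of one property are contiguous. That difference is what determines how much data a query has to touch.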
Suppose that a user wishes to get the event IDs that have energy, E, greater
than 10 MeV (million electron-volts) and that have a number of pions, N_p,
between 100 and 200, where a pion is a specific type of particle. This
predicate can be written as: $((E > 10) \wedge (100 < N_p < 200))$. In this
case searching over the vertically organized data is likely to be faster, since
only the data in the two columns for E and N_p have to be brought into memory
and searched. In contrast, the horizontal organization will require reading the
entire table. Given this simple observation, why were relational database
systems typically built with a horizontal organization? As will be discussed
next, the majority of database systems were designed for transaction
processing, where frequent updates of randomly requested rows were expected,
which is the reason for choosing the horizontal organization. In this chapter
we discuss the class of applications that benefit greatly from a vertical
organization, which includes most scientific data applications.
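As a rough sketch of why the column-wise scan touches less data, the predicate on E and N_p can be evaluated against both layouts while counting how many values are read. The sample events, the extra "other" property, and the function names are all made up for illustration. The position of an event in the list is assumed to serve as its event ID.

```python
# Illustrative sketch only: the same predicate evaluated over a row-wise and a
# column-wise copy of a tiny, made-up event table.

def query_columns(columns, e_min=10.0, np_low=100, np_high=200):
    """Scan only the E and Np columns; returns (matching event IDs, values read)."""
    energy, pions = columns["E"], columns["Np"]
    hits = [i for i, (e, n) in enumerate(zip(energy, pions))
            if e > e_min and np_low < n < np_high]
    values_read = len(energy) + len(pions)
    return hits, values_read

def query_rows(rows, e_min=10.0, np_low=100, np_high=200):
    """Scan whole rows; every property of every row is read."""
    hits, values_read = [], 0
    for event_id, row in enumerate(rows):
        values_read += len(row)
        if row["E"] > e_min and np_low < row["Np"] < np_high:
            hits.append(event_id)
    return hits, values_read

# Hypothetical sample data: four events with three properties each.
rows = [
    {"E": 12.5, "Np": 150, "other": 0.1},
    {"E": 8.0, "Np": 120, "other": 0.2},
    {"E": 30.0, "Np": 250, "other": 0.3},
    {"E": 11.0, "Np": 101, "other": 0.4},
]
# Column-wise copy of the same table: one list per property.
columns = {name: [row[name] for row in rows] for name in rows[0]}

if __name__ == "__main__":
    print(query_columns(columns))
    print(query_rows(rows))
```

Both functions return the same event IDs, but the row-wise scan reads every property of every event. With 200 properties per event instead of three, the gap in values read would grow accordingly.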
7.1.2 Design Rules and User Needs
A recent contribution to the literature on database design is a description of
the design rationale for Sybase IQ Multiplex, what its authors call complex
analytics 1: a parallel, multi-node, shared-storage vertical database system 2
whose major design goal is to efficiently manage large-scale data warehousing