Database Reference
In-Depth Information
The essential thing to understand in this format is that the database engine doesn't know the offset of the
columns in a row. For example, if it has to locate column 3, it has to start by locating column 1 (that's simple, since
the length of the header is known). Then, based on the length of column 1, it locates column 2. Finally, based on
the length of column 2, it locates column 3. So whenever a row has more than a few columns, a column near the
beginning of the row might be located much faster than a column near the end of the row. To better understand this,
you can perform the following test, based on the column_order.sql script, to measure the overhead associated with
the search of a column:
1.
Create a table with 250 columns:
CREATE TABLE t (n1 NUMBER, n2 NUMBER, ..., n249 NUMBER, n250 NUMBER)
2.
Insert 10,000 rows. Every column of every row stores the same value.
3.
Measure the response time for the following query, executed 1,000 times in a loop,
for each column:
SELECT count(<col>) FROM t
Figure 16-2 summarizes the results of this test run on my test server. It's important to note that the query
referencing the first column (position 1) performs about five times faster than the query referencing the 250th column
(position 250). This is because the database engine optimizes every access and thus avoids locating and reading
columns that aren't necessary for the processing. For example, the query SELECT count(n3) FROM t stops walking the
row when the third column is located. Figure 16-2 also reports, at position 0, the figure for count(*) , which doesn't
need to access any column at all.
Figure 16-2. The position of a column in a row vs. the amount of processing needed to access it
Because of this, the general rule is to place intensively accessed columns first. However, in order to take
advantage of this, you should be careful to access (reference) only the columns that are really needed. In any case,
from a performance point of view, selecting not-needed columns (or worse, as it's sadly very often done, referencing
all columns using a SELECT * even if only some of them are actually needed by the application) is bad, not only
because there's an overhead when reading them from blocks, as you have just seen, but also because more memory is
Search WWH ::




Custom Search