In practice, the overhead related to the position of the columns is more noticeable in one of the following situations:
• When tables have many columns and the SQL statements frequently reference very few of them, located near the end of the row.
• When many rows are read from a single block, such as during a full table scan. This is because, more often than not, when only a few rows are read per block, the overhead for locating and accessing the block is by far more significant than the overhead for locating and accessing the columns. For example, if I run the column_order.sql script with PCTFREE set to 90 (thereby decreasing the number of rows per block), the query referencing the first column performs less than two times faster than the query referencing the 250th column (as shown in Figure 16-2, it's about five times faster with PCTFREE set to 10). A sketch of this kind of test is shown below.
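The following is a minimal sketch of the kind of comparison such a test performs; the table name t, the row count, and the PCTFREE value are illustrative assumptions, not the actual column_order.sql code. It builds a 250-column table dynamically and then times a query on the first column against a query on the last one.

-- Build a hypothetical 250-column test table; with PCTFREE 90 only a few
-- rows fit into each block.
DECLARE
  l_sql VARCHAR2(32767) := 'CREATE TABLE t PCTFREE 90 AS SELECT ';
BEGIN
  FOR i IN 1 .. 250
  LOOP
    l_sql := l_sql || 'rownum AS c' || i || ',';
  END LOOP;
  l_sql := rtrim(l_sql, ',') || ' FROM dual CONNECT BY level <= 10000';
  EXECUTE IMMEDIATE l_sql;
END;
/

-- Compare access to a column at the beginning of the row with access to
-- one at the end of the row (timings shown by SQL*Plus).
SET TIMING ON
SELECT count(c1) FROM t;
SELECT count(c250) FROM t;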
Since trailing NULL values aren't stored, it makes sense to place columns expected to contain NULL values at the
end of the table. In this way, the number of physically stored columns and consequently the average size of the rows
might decrease.
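As a sketch of this advice (table and column names are purely illustrative), the columns that are expected to be NULL for most rows are declared last:

CREATE TABLE orders (
  id            NUMBER        NOT NULL,
  customer_id   NUMBER        NOT NULL,
  amount        NUMBER(10,2)  NOT NULL,
  -- columns that are NULL for most rows go at the end, so that the
  -- trailing NULL values don't have to be physically stored
  cancelled_at  DATE,
  cancel_reason VARCHAR2(200)
);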
Optimal Datatype
As briefly described in Chapter 1, specifically in the “Designing for Performance” section, in recent years I have
witnessed a worrying trend in physical design that I call wrong datatype selection. At first glance, choosing the
datatype for a column seems like a very straightforward decision to make. Nevertheless, in a world where software
architects spend a lot of time discussing high-level things such as agile software development, SOA, or persistence
frameworks, most people seem to forget about low-level ones. I'm convinced it's essential to get back to the basics and
discuss why datatype selection is important.
Pitfalls in Datatype Selection
To illustrate wrong datatype selection, I present five examples of typical problems that I have encountered over and
over again.
The first problem caused by wrong datatype selection is wrong or lacking validation of data when it's inserted or
modified in the database. For example, if a column is supposed to store numeric values, choosing a character string
datatype for it calls for external validation. In other words, the database engine isn't able to validate the data; that task is left to the application. Even if such a validation is easy to implement, bear in mind that every time the same
piece of code is spread to several locations, instead of being centralized in the database, sooner or later there will be
a mismatch in functionality (typically, in some locations the validation may be forgotten, or maybe the validation
changes later and its implementation is updated only in some locations). The example I'm presenting is related to
the nls_numeric_characters initialization parameter. Remember that this initialization parameter specifies the characters used as the decimal and group separators. For example, in Switzerland it's usually set to “.,”, and therefore the value of pi is formatted as 3.14159. In Germany, in contrast, it's commonly set to “,.”, and therefore the same value is formatted as 3,14159. Sooner or later, running an application with different client-side settings of this initialization parameter will cause an ORA-01722: invalid number error if conversions from VARCHAR2 to NUMBER take place because a wrong datatype is used in the database. And by the time you notice this, your database will be
filled with VARCHAR2 columns containing both formats, and therefore a painful data reconciliation will have to
be performed.
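The following minimal sketch shows how the problem surfaces; the table and column names are illustrative assumptions:

-- Session with the Swiss-style setting: '.' is the decimal separator
ALTER SESSION SET nls_numeric_characters = '.,';
CREATE TABLE measurements (val VARCHAR2(20));       -- wrong datatype: a number stored as text
INSERT INTO measurements VALUES (TO_CHAR(3.14159)); -- stored as '3.14159'
COMMIT;

-- Session with the German-style setting: ',' is the decimal separator
ALTER SESSION SET nls_numeric_characters = ',.';
SELECT * FROM measurements WHERE val > 3;           -- implicit TO_NUMBER('3.14159') fails with ORA-01722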
The second problem caused by wrong datatype selection is loss of information. In other words, during the
conversion of the original (correct) datatype to the database datatype, information gets lost. For example, imagine
what happens when the date and time of an event are stored with a DATE datatype instead of a TIMESTAMP WITH TIME
ZONE datatype. Fractional seconds and time zone information get lost. Although the former leads to what could be
 