considered a small error (less than 1 second), the latter might be a bigger problem. In one case that I witnessed, a
customer's data was always generated using local standard time (without daylight saving time adjustments) and
stored directly in the database. The problems arose when, for reporting purposes, a correction for daylight saving time
had to be applied. For that purpose, a function converting between two time zones was implemented. Its signature
was the following:
new_time_dst(in_date DATE, tz1 VARCHAR2, tz2 VARCHAR2) RETURN DATE
Calling such a function once was very fast. The problem was calling it thousands of times for each report; as a
result, the response time increased by a factor of 25. Clearly, with the correct datatype, everything would not only
be faster, but also easier (the conversion would be performed automatically).
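To illustrate, the following is a minimal sketch (the orders_ts table and its column are made up for this example) showing how the correct datatype removes the need for such a function. With TIMESTAMP WITH TIME ZONE, the daylight saving time rules of both regions are applied automatically:
 
SQL> CREATE TABLE orders_ts (created_at TIMESTAMP WITH TIME ZONE);
 
SQL> INSERT INTO orders_ts
  2  VALUES (TIMESTAMP '2014-06-15 14:30:00 Europe/Zurich');
 
SQL> SELECT created_at AT TIME ZONE 'America/New_York' AS created_at_ny
  2  FROM orders_ts;
 
Because Europe/Zurich observes CEST (UTC+2) in June and America/New_York observes EDT (UTC-4), the query returns 08:30 of the same day; no hand-written conversion function is involved.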
The third problem caused by wrong datatype selection is that things don't work as expected. Let's say you
have to range-partition a table based on a DATE or TIMESTAMP column storing date and time information. This is
usually no big deal. The problem arises if the column used as the partition key contains the numeric representation
of the datetime value based on some format mask, or an equivalent string representation, instead of plain DATE or
TIMESTAMP values. If the conversion from the datetime values to the numeric values is performed with a format mask
like YYYYMMDDHH24MISS, the definition of the range partitions is still possible. However, if the conversion is based on
a format mask like DDMMYYYYHH24MISS, you can't solve the problem without changing the datatype or the format of
the column, because the numeric (or string) order doesn't preserve the natural datetime order (as of version 11.1,
in some cases it's possible to work around the problem by implementing virtual column-based partitioning, as
sketched below).
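The following is a minimal sketch of that workaround (the table and partition names are made up, and it assumes a NUMBER column holding DDMMYYYYHH24MISS values): a virtual column reconstructs the DATE value and serves as the partition key. The FM00000000000000 mask restores the leading zero that the numeric representation drops for days 01 through 09:
 
SQL> CREATE TABLE t_part (
  2    n NUMBER(14),
  3    d DATE GENERATED ALWAYS AS (
  4      to_date(to_char(n, 'FM00000000000000'), 'DDMMYYYYHH24MISS')
  5    ) VIRTUAL
  6  )
  7  PARTITION BY RANGE (d) (
  8    PARTITION p_2014_h1 VALUES LESS THAN (to_date('01072014', 'DDMMYYYY')),
  9    PARTITION p_2014_h2 VALUES LESS THAN (to_date('01012015', 'DDMMYYYY'))
 10  );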
The fourth problem caused by wrong datatype selection is related to the query optimizer. This is probably the
least obvious of this short list and also the one leading to the subtlest problems. The reason is that with the
wrong datatypes, the query optimizer makes wrong estimates and, consequently, chooses suboptimal access
paths. Frequently, when something like that happens, most people blame the query optimizer for “once again” not
doing its job correctly. In reality, the problem is that information is hidden from it, so the query optimizer can't do its
job correctly. To better understand this problem, take a look at the following example, which is based on the
wrong_datatype.sql script. Here, you're checking the estimated cardinality of similar restrictions applied to three
columns that store the same set of data (the date of each day in 2014) but that use different datatypes. As you
can see, the query optimizer is able to make a sensible estimate (the correct cardinality is 28) only for the column
that's correctly defined:
SQL> CREATE TABLE t (d DATE, n NUMBER(8), c VARCHAR2(8));
SQL> INSERT INTO t (d)
2 SELECT to_date('20140101','YYYYMMDD')+level-1
3 FROM dual
4 CONNECT BY level <= 365;
SQL> UPDATE t SET n = to_number(to_char(d,'YYYYMMDD')), c = to_char(d,'YYYYMMDD');
SQL> execute dbms_stats.gather_table_stats(ownname=>user, tabname=>'t')
SQL> SELECT * FROM t ORDER BY d;
D                  N C
--------- ---------- --------
01-JAN-14   20140101 20140101
02-JAN-14   20140102 20140102
...
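The cardinality check itself is performed with queries like the following (a sketch of the idea; the exact statements
in the wrong_datatype.sql script may differ). Each one restricts a different column to February 2014:
 
SQL> EXPLAIN PLAN FOR
  2  SELECT * FROM t
  3  WHERE d BETWEEN to_date('20140201','YYYYMMDD')
  4              AND to_date('20140228','YYYYMMDD');
 
SQL> SELECT * FROM table(dbms_xplan.display(NULL, NULL, 'basic +rows'));
 
SQL> EXPLAIN PLAN FOR
  2  SELECT * FROM t WHERE n BETWEEN 20140201 AND 20140228;
 
SQL> SELECT * FROM table(dbms_xplan.display(NULL, NULL, 'basic +rows'));
 
SQL> EXPLAIN PLAN FOR
  2  SELECT * FROM t WHERE c BETWEEN '20140201' AND '20140228';
 
SQL> SELECT * FROM table(dbms_xplan.display(NULL, NULL, 'basic +rows'));
 
Only the restriction on the d column yields an estimate close to 28. For n and c, the query optimizer assumes the
values are uniformly distributed between the minimum (20140101) and the maximum (20141231) and interpolates
linearly over that range; because the range is full of holes that contain no data (20140132 through 20140199, for
example), the resulting estimate is far too low.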
 