Optimizing Joins - Troubleshooting Oracle Performance

Database Reference

In-Depth Information

To execute merge joins, a non-negligible amount of resources may be spent on sort operations. To improve

performance, the query optimizer avoids performing sort operations whenever it saves resources. But of course, this

is possible only when the data is already sorted according to the columns used as the join condition. That happens

in two situations. The first is when an index range scan taking advantage of an index built on the columns used as

the join condition is used. The second is when a step preceding the merge join (for example, another merge join)

already sorted the data in the right order. For example, in the following execution plan, notice how table t1 is accessed

through the t1_pk index (which is built on the id column used as the join condition). As a result, for the left input, the

sort operation ( SORT JOIN ) is avoided:

----------------------------------------------

| Id | Operation | Name |

----------------------------------------------

| 0 | SELECT STATEMENT | |

| 1 | MERGE JOIN | |

|* 2 | TABLE ACCESS BY INDEX ROWID| T1 |

| 3 | INDEX FULL SCAN | T1_PK |

|* 4 | SORT JOIN | |

| 5 | TABLE ACCESS FULL | T2 |

----------------------------------------------

2 - filter("T1"."N"=19)

4 - access("T1"."ID"="T2"."T1_ID")

filter("T1"."ID"="T2"."T1_ID")

An important caveat about execution plans such as the previous one is that, because no sort operation takes

place for the left input, no work area is associated to it. As a result, there's no place to store data resulting from the left

input while the right input is executed. The processing of the previous execution plan can be summarized as follows:

A first batch of rows is extracted from the t1 table through the t1_pk index. It's essential

to understand that this first batch contains all rows only when the result set is very small.

Remember, there's no work area to temporarily store many rows.

1.

Provided the previous step returns some data, all rows in the t2 table are read through a

full scan, sorted according to the columns used as the join condition, and stored in the

work area, possibly spilling temporary data to the disk.

2.

3.

The two sets of data are joined together, and the resulting rows are returned. When the first

batch of rows extracted from the left input has been completely processed, more rows are

extracted from the t1 table, if necessary, and joined to the data of the right input already

present in a work area.

As shown in the previous execution plan, it's possible to avoid the sort associated to the left input. The same

doesn't apply to the right input, though. To explain why, the following example shows the execution plan chosen by

the query optimizer if you execute the same query as before, but this time by specifying, through the leading hint,

that table t2 must be accessed in the left input. Notice that even though the access path of the right input returns data

in the correct order, the data has to go through a SORT JOIN operation (4):

SELECT /*+ leading(t2 t1) use_merge(t1) index(t1 t1_pk) */ *

FROM t1, t2

WHERE t1.id = t2.t1_id

AND t1.n = 19

Troubleshooting Oracle Performance

Search WWH ::

Custom Search

Home