Databases Reference
In-Depth Information
Join Orders
Join ordering is one of the most complex problems in query optimization, and one that
has been the subject of extensive research since the seventies. It refers to the process of
calculating the optimal join order, that is, the order in which the necessary tables are
joined, when executing a query. As suggested in the ongoing challenges briefly discussed
earlier, join ordering is directly related to the size of the search space, as the number of
possible plans for a query grows very rapidly, depending on the number of tables joined.
A join combines records from two tables based on some common information, and
the predicate which defines which columns are used to join the tables is called a join
predicate. A join works with only two tables at a time, so a query requesting data from
n tables must be executed as a sequence of n - 1 joins, but it should be noted that the first
join does not have to be completed before the next join can be started. Because the
order of joins is a key factor in controlling the amount of data flowing between each
operator in the execution plan, it's a factor which the Query Optimizer needs to pay
close attention to.
Specifically, the Query Optimizer needs to make two important decisions regarding joins:
• the selection of a join order
• the choice of a join algorithm.
In this section I'll talk about join orders but, since the implementation of join algorithms
is part of the execution engine, selecting a join algorithm will be explained in Chapter 2 ,
The Execution Engine . Join order is, strictly speaking, a separate concern from the
algorithms provided by the execution engine, so I'll give an overview of the former here.
As mentioned, the order in which the tables are joined determines the cost and
performance of a query. Although the results of the query are the same, regardless
of the join order, the access cost of each different join order can vary dramatically.
Search WWH ::




Custom Search