Database Reference
In-Depth Information
The hash join performs its operation in two phases: the build phase and the probe phase. In the most commonly
used form of hash join, the in-memory hash join, the entire build input is scanned or computed, and then a hash
table is built in memory. Each row from the outer input is inserted into a hash bucket depending on the hash value
computed for the hash key (the set of columns in the equality predicate). A hash is just a mathematical construct run
against the values in question and used for comparison purposes.
This build phase is followed by the probe phase. The entire probe input is scanned or computed one row at a
time, and for each probe row, a hash key value is computed. The corresponding hash bucket is scanned for the hash
key value from the probe input, and the matches are produced. Figure 7-8 illustrates the process of an in-memory
hash join.
Probe
phase
Start build phase
Start hash Join
Choose build input
and probe input
Build In-memory
hash table
Start probe phase
Scan build Input
Build phase
Scan probe input for
a probe-input row
Probe phase
Compute hash key for
a build Input row
Computer hash key for
the probe input row
Done
Hash bucket
for hash key
exists?
Scan corresponding hash
bucket In hash table
No
Row match found?
Create hash bucket
In hash table
Yes
Yes
Insert all rows in
build input
Produce
matched row
No
For all rows in
build input
For all rows in
probe input
Probe
phase
Done
Figure 7-8. Workflow for an in-memory hash join
The query optimizer uses hash joins to process large, unsorted, nonindexed inputs efficiently. Let's now look at
the next type of join: the merge join.
 
Search WWH ::




Custom Search