SQL/MP Query Guide
Improving Query Performance Through Query
Design
HP NonStop SQL/MP Query Guide—524488-003
3-31
Hash Join
Simple Sequential Hash Join
To execute a simple sequential hash join, the executor performs two steps.
1. The executor makes a single pass over the inner table of the join, applying
selection and projection operations (if possible) to eliminate unnecessary rows and
columns of the inner table prior to the actual join operation. The executor stores
the remaining portion of the inner table in memory, building an in-memory hash
table using the smaller table hashed on the join attribute. (The join attributes are
used as a hash key.)
2. The executor then makes one pass over the outer table of the join, applying
selection and projection operations if possible. For each row of the outer table that
remains after selection and projection, the executor obtains the corresponding row
or rows of the inner table (as in a nested join). The executor joins matching rows
and delivers them to the next stage of the query execution process.
Instead of sorting, the inner table is read into a memory-resident hash table. The outer
table is processed like the outer table of a nested join. Building a hash table is faster
than sorting, so hash joins have better performance as long as enough memory is
available.
The order of the outer table is preserved during a simple sequential hash-join
operation.
Hybrid Hash Join
A hybrid hash join handles overflow situations when the smaller table does not fit
entirely in physical memory. The hybrid join retains as much as possible of the inner
table in memory, but divides rows of both inner and outer tables into buckets, or
clusters, that reside on disk and can be processed in memory.
If the outer table is actually a composite of more than one table, SQL must write it
entirely to disk and complete the previous join before it can begin the current hybrid
hash join. This is an expensive operation.
The hybrid hash join does not preserve ordering, because clusters of data might be
written to disk. Each cluster is in order, but the concatenation of them is not in order.
This situation affects performance when the query requests subsequent orderings such
as an ORDER BY request.