SQL/MP Query Guide

Improving Query Performance Through Query
Design
HP NonStop SQL/MP Query Guide524488-003
3-30
Hash Join
A hash join uses a hashing function, rather than indexes or sequential reads, to access
specific rows in a file. In general, the process has two phases:
1. The build phase reads the inner table of the join into virtual memory and builds an
in-memory hash table for it, using a hashing function based on join attributes.
The hashing function calculates indexes into an array of values in main memory.
All equal values go to the same table entry and are chained by linked list; other
values can also hash to the same array entry. The array is divided into buckets,
each of which is small enough to fit into main memory. If the join is partitioned, the
arrays are distributed across partitions.
Figure 3-4 shows the in-memory hash chain structure for an inner table.
2. The probe phase reads the outer table sequentially. For each outer (probing) row, it
accesses matching rows in the hashed inner table and generate result rows.
SQL supports two types of hash joins, simple and hybrid:
A simple hash join is fastest when the inner table of the join operation fits entirely
within memory.
A hybrid hash join increases performance when the inner table is much larger than
available memory.
SQL supports hashing for multiple join columns. This strategy allows efficient hashing
when more than one column is specified in the join predicate.
SQL does not support hash joins for columns that use collations.
For information on forcing a hash join, see the SQL/MP Reference Manual. Also, see
Specifying a Join Method on page 3-43.
The following subsections describe sequential and parallel operation of each type of
hash join.
Figure 3-4. Hash Function Example
Hash
Function
Hash
Buckets
Hash Chains
of Records
02 4
VST0304.vsd