SQL/MP Query Guide

ManualsBrandsHP ManualsServerHP NonStop G-Series

121

122

123

124

125

126

127

128

129

130

Improving Query Performance Through Query

Design

HP NonStop SQL/MP Query Guide—524488-003

3-30

Hash Join

A hash join uses a hashing function, rather than indexes or sequential reads, to access

specific rows in a file. In general, the process has two phases:

1. The build phase reads the inner table of the join into virtual memory and builds an

in-memory hash table for it, using a hashing function based on join attributes.

The hashing function calculates indexes into an array of values in main memory.

All equal values go to the same table entry and are chained by linked list; other

values can also hash to the same array entry. The array is divided into buckets,

each of which is small enough to fit into main memory. If the join is partitioned, the

arrays are distributed across partitions.

Figure 3-4 shows the in-memory hash chain structure for an inner table.

2. The probe phase reads the outer table sequentially. For each outer (probing) row, it

accesses matching rows in the hashed inner table and generate result rows.

SQL supports two types of hash joins, simple and hybrid:

•

A simple hash join is fastest when the inner table of the join operation fits entirely

within memory.

•

A hybrid hash join increases performance when the inner table is much larger than

available memory.

SQL supports hashing for multiple join columns. This strategy allows efficient hashing

when more than one column is specified in the join predicate.

SQL does not support hash joins for columns that use collations.

For information on forcing a hash join, see the SQL/MP Reference Manual. Also, see

Specifying a Join Method on page 3-43.

The following subsections describe sequential and parallel operation of each type of

hash join.

Figure 3-4. Hash Function Example

Hash

Function

Hash

Buckets

Hash Chains

of Records

02 4

VST0304.vsd