SQL/MP Query Guide

ManualsBrandsHP ManualsServerHP Integrity NonStop J-Series

131

132

133

134

135

136

137

138

139

140

Improving Query Performance Through Query

Design

HP NonStop SQL/MP Query Guide—524488-003

3-39

Writing Efficient Joins

In general, the optimizer chooses a hash join in preference to a sort merge join for

situations where an equality predicate exists for the join operation. The performance

advantage of the hash join increases with increasing size difference between input

tables. This advantage decreases when data is not distributed uniformly.

The choice between the hash join and the key-sequenced merge join is not so simple.

Both joins scan the inner table at almost the same speed. The major difference is that

the hash join spends time building or probing the hash table, while the key-sequenced

merge join spends time evaluating the join predicates against all the rows qualified by

the scan. However, the key-sequenced merge join has two main advantages:

•

The key-sequenced merge join has no space requirement for preprocessing the

inner table, while the hash join needs to allocate space for it.

•

Unlike the hash join, the key-sequenced merge join does not have to send the

hashed tables to all of the outer executor server processes. Therefore, a key-

sequenced merge join does not incur this extra message cost.

Also, uneven data distribution can lessen performance by causing differential overflow

of hash clusters; this effect can be especially influential when selectivity is low. (For

information about selectivity, see Section 5, Selectivity and Cost Estimates.)

If tables are partitioned similarly on joining columns, a parallel nested join is used in

preference to a parallel hash join.

The existence of an index on each column of a join predicate also influences the

selection of join technique.

Writing Efficient Joins

The optimizer determines efficient join strategies for various combinations of tables

and join methods (described in How the Optimizer Processes Join Operations on

page 3-24). There are five ways to assist this process:

•

Specifying predicates for smaller tables of the join

•

Using indexes for the join

•

Eliminating implicit joins

•

Adding join predicates

•

Using joins instead of subqueries

The first option minimizes the amount of data returned by the join; the other options

extend the methods available to SQL for optimizing the operation.

This subsection describes two CONTROL directives that force the optimizer to

evaluate joins in a specific way. These options should only be used with extreme

caution and thorough knowledge of join operations.

If your query requires a join, first make sure there is nothing missing from the query,

such as predicates on key columns.