SQL/MP Query Guide
Improving Query Performance Through Query
Design
HP NonStop SQL/MP Query Guide—524488-003
3-39
Writing Efficient Joins
In general, the optimizer chooses a hash join in preference to a sort merge join for
situations where an equality predicate exists for the join operation. The performance
advantage of the hash join increases with increasing size difference between input
tables. This advantage decreases when data is not distributed uniformly.
The choice between the hash join and the key-sequenced merge join is not so simple.
Both joins scan the inner table at almost the same speed. The major difference is that
the hash join spends time building or probing the hash table, while the key-sequenced
merge join spends time evaluating the join predicates against all the rows qualified by
the scan. However, the key-sequenced merge join has two main advantages:
•
The key-sequenced merge join has no space requirement for preprocessing the
inner table, while the hash join needs to allocate space for it.
•
Unlike the hash join, the key-sequenced merge join does not have to send the
hashed tables to all of the outer executor server processes. Therefore, a key-
sequenced merge join does not incur this extra message cost.
Also, uneven data distribution can lessen performance by causing differential overflow
of hash clusters; this effect can be especially influential when selectivity is low. (For
information about selectivity, see Section 5, Selectivity and Cost Estimates.)
If tables are partitioned similarly on joining columns, a parallel nested join is used in
preference to a parallel hash join.
The existence of an index on each column of a join predicate also influences the
selection of join technique.
Writing Efficient Joins
The optimizer determines efficient join strategies for various combinations of tables
and join methods (described in How the Optimizer Processes Join Operations on
page 3-24). There are five ways to assist this process:
•
Specifying predicates for smaller tables of the join
•
Using indexes for the join
•
Eliminating implicit joins
•
Adding join predicates
•
Using joins instead of subqueries
The first option minimizes the amount of data returned by the join; the other options
extend the methods available to SQL for optimizing the operation.
This subsection describes two CONTROL directives that force the optimizer to
evaluate joins in a specific way. These options should only be used with extreme
caution and thorough knowledge of join operations.
If your query requires a join, first make sure there is nothing missing from the query,
such as predicates on key columns.










