SQL/MP Query Guide

Selectivity and Cost Estimates
HP NonStop SQL/MP Query Guide524488-003
5-2
Computing Selectivity
Join order
The selectivity of each table helps determine the optimizers choice of the outer
and inner table, because it helps determine cost. Cost and available access paths
are the determining criteria for join order.
Types of sorts performed (in-memory user process sort or external physical sort)
The following subsections describe general selectivity computations; predicate, index,
and table selectivity; default selectivity; and join and grouping selectivity.
Computing Selectivity
The optimizer estimates selectivities based on statistics obtained prior to the
compilation of queries; therefore, SQL must have current statistics to work from. When
you specify an UPDATE [ ALL ] STATISTICS command, SQL inserts these statistics
into the COLUMNS, FILES, BASETABS, AND INDEXES catalog table.
Furthermore, the optimizer operates as if the data were uniformly distributed in each
column within the range specified by the statistics and the unique values were
uniformly distributed between the lowest and highest values.
Selectivities are cumulative; that is, the combination of predicate selectivities
determines the selectivity for a table or an index.
The optimizer uses these statistics to compute selectivities:
The second-high and second-low values of a column (SECONDHIGHVALUE and
SECONDLOWVALUE in the COLUMNS table). To avoid extreme values that might
be very different from the rest of the values, SQL does not use the first-high and
first-low values of a column.
The unique entry count (UEC), which is the number of unique values of a column
(UNIQUEENTRYCOUNT in the COLUMNS table), used to calculate the selectivity
for equal (=) and not equal (<>) comparisons, IS [NOT] NULL clauses, and join
selectivity.
SQL estimates the overall UEC by linear interpolation. For partitioned tables, the UEC
equals the total number of unique values divided by the number of partitions. Any
remainder is added to the primary partition. The quotient H2/L2 is the same for all
partitions.
By default, UPDATE STATISTICS reads partitions smaller than 1,000 blocks in their
entirety and uses sampling for partitions of 1,000 blocks or larger. You can influence
the amount of data to be examined by specifying the exact number of blocks:
SAMPLE n BLOCKS
For more information about UPDATE STATISTICS, see Section 6, Analyzing Query
Performance. For a complete description of the UPDATE STATISTICS command and
the COLUMNS catalog table, see the SQL/MP Reference Manual.