SQL/MP Query Guide

ManualsBrandsHP ManualsServerHP NonStop G-Series

191

192

193

194

195

196

197

198

199

200

Selectivity and Cost Estimates

HP NonStop SQL/MP Query Guide—524488-003

5-2

Computing Selectivity

•

Join order

The selectivity of each table helps determine the optimizer’s choice of the outer

and inner table, because it helps determine cost. Cost and available access paths

are the determining criteria for join order.

•

Types of sorts performed (in-memory user process sort or external physical sort)

The following subsections describe general selectivity computations; predicate, index,

and table selectivity; default selectivity; and join and grouping selectivity.

Computing Selectivity

The optimizer estimates selectivities based on statistics obtained prior to the

compilation of queries; therefore, SQL must have current statistics to work from. When

you specify an UPDATE [ ALL ] STATISTICS command, SQL inserts these statistics

into the COLUMNS, FILES, BASETABS, AND INDEXES catalog table.

Furthermore, the optimizer operates as if the data were uniformly distributed in each

column within the range specified by the statistics and the unique values were

uniformly distributed between the lowest and highest values.

Selectivities are cumulative; that is, the combination of predicate selectivities

determines the selectivity for a table or an index.

The optimizer uses these statistics to compute selectivities:

•

The second-high and second-low values of a column (SECONDHIGHVALUE and

SECONDLOWVALUE in the COLUMNS table). To avoid extreme values that might

be very different from the rest of the values, SQL does not use the first-high and

first-low values of a column.

•

The unique entry count (UEC), which is the number of unique values of a column

(UNIQUEENTRYCOUNT in the COLUMNS table), used to calculate the selectivity

for equal (=) and not equal (<>) comparisons, IS [NOT] NULL clauses, and join

selectivity.

SQL estimates the overall UEC by linear interpolation. For partitioned tables, the UEC

equals the total number of unique values divided by the number of partitions. Any

remainder is added to the primary partition. The quotient H2/L2 is the same for all

partitions.

By default, UPDATE STATISTICS reads partitions smaller than 1,000 blocks in their

entirety and uses sampling for partitions of 1,000 blocks or larger. You can influence

the amount of data to be examined by specifying the exact number of blocks:

SAMPLE n BLOCKS

For more information about UPDATE STATISTICS, see Section 6, Analyzing Query

Performance. For a complete description of the UPDATE STATISTICS command and

the COLUMNS catalog table, see the SQL/MP Reference Manual.