SQL/MP Query Guide

Selectivity and Cost Estimates
HP NonStop SQL/MP Query Guide524488-003
5-4
Predicate Selectivity
column >= is treated as NOT column < constant
For numeric ranges, the optimizer uses the constant to interpolate between the
SECONDLOWVALUE and SECONDHIGHVALUE obtained by UPDATE STATISTICS.
For example, if the RETAIL_PRICE column of the INVNTRY table has a
SECONDHIGHVALUE of 99 and a SECONDLOWVALUE of 2, the selectivity of the
predicate:
RETAIL_PRICE > 10
is the second-high value minus the supplied value, divided by the quantity
(SECONDHIGHVALUE minus the SECONDLOWVALUE):
99 - 10 / 99 - 2, or approximately 0.92
Ideally, this process would allow exact computation of the number of values selected
from a column that has a uniform distribution of data values. In practice, if a predicate
selects a larger proportion of records between the SECONDLOWVALUE and the
SECONDHIGHVALUE, then that predicate has a higher selectivity than one that
selects a smaller proportion of records. The optimizer cannot determine the distribution
of data. Therefore, when data is unevenly distributed, a higher selectivity does not
imply a greater number of selected rows.
Combinations of Predicates
When predicates are connected by the AND or OR operator, SQL calculates selectivity
in one of two ways. If the query contains no range predicate, then selectivity is
calculated according to rules of probability theory. For queries with range predicates, a
different calculation is used.
Selectivity When No Range Is Used
When no range is used, SQL calculates selectivity as follows:
If predicates are connected by the AND operator, their combined selectivity is
estimated to be the product of their individual selectivities. Thus, a combined
selectivity for these query equals .1 * 0.6667 or 0.06667:
WHERE empno = :hv (selectivity = 0.1)
AND SALARY < 50000 (selectivity = 0.6667)
If predicates are connected by the OR operator, their combined selectivity is
estimated as the sum of the individual selectivities minus the product of the
individual selectivities. Thus, a combined selectivity for these query equals (0.1 +
0.1 - (0.1 * 0.1)), or 0.19:
WHERE EMPNO = :hv (selectivity = 0.1)
OR SALARY < 5000 (selectivity = 0.1)
These rules rely on the assumption that the truth values of the individual predicates are
independent of one another. This estimate can become inaccurate if the predicates are
highly correlated or redundant.