SQL/MX 2.x Query Guide (H06.04+, J06.03+)
Keeping Statistics Current
HP NonStop SQL/MX Query Guide—540437-005
3-5
Sampling and UPDATE STATISTICS
Analyzing the Possible Impact of Updating Statistics
Depending on the size of the table, updating statistics can take longer than you would
like. Consider updating statistics during the hours when peak performance is not
required.
If you want to preserve the existing query execution plan, be aware that updating
statistics might cause the optimizer to choose a different plan. Usually, you can
improve performance by updating the statistics on a table to reflect the current status.
The update statistics operations, however, might not improve performance, as
discussed next:
•
You can update statistics to perform a sampling of rows to determine the statistical
information. Depending on your sample size, this procedure could take a long time.
(That is, the larger the sample, the longer it might take.) Because this is a statistical
sampling method, the statistics gathered are not exact. Inaccurate statistics (wrong
by more than 10 percent) can adversely affect the plan. See Sampling and
UPDATE STATISTICS.
•
Because the UPDATE STATISTICS statement does not automatically recompile
programs, the operation does not invalidate the dependent programs. If you want
to take advantage of the new statistics, however, you must explicitly recompile the
dependent programs.
Sampling and UPDATE STATISTICS
Use sampling to control the amount of time spent calculating statistics. If you do not
specify sampling, statistics are collected by scanning the entire table. The optional
SAMPLE clause provides several methods of producing a sample set, based on a
ratio, to determine the histograms. The SQL/MX Reference Manual describes each
sampling method in detail.
You might want to use sampling because of the amount of time required to update
statistics on the entire table. Sampling techniques that do not perform full large table
scans result in significant performance gain. You can use SAMPLE without specifying
the number of rows. By default, NonStop SQL/MX samples two percent of the table up
to a maximum of two million rows. You might want to explicitly specify a larger sample
size to increase statistics accuracy or a smaller sample size to reduce the running
time.
When you use the SAMPLE option with UPDATE STATISTICS, a temporary table is
created. You can use the HIST_SCRATCH_VOL default attribute for SQL/MX and
SQL/MP tables. For SQL/MX tables, NonStop SQL/MX creates a single partition or
hash partitions across multiple volumes as specified in the HIST_SCRATCH_VOL
default attribute. NonStop SQL/MX determines how many partitions are needed based
on the sample set retrieved by the SAMPLE option. It randomly selects the volume or
volumes to use from the list of volumes specified in the HIST_SCRATCH_VOL default
attribute. NonStop SQL/MX creates only as many partitions as specified with
HIST_SCRATCH_VOL. If the default attribute is not used, the temporary table is










