SQL/MX Release 2.0 Best Practices

Possible Implementations 34
3. Determine the effect of the UPDATE STATISTICS statement, and, optionally, back out the generated
histogram, if necessary.
a. In MXCI, back up current histogram tables, if any:
CREATE TABLE myhist LIKE HISTOGRAMS;
INSERT INTO myhist SELECT * FROM HISTOGRAMS;
CREATE TABLE myhistint LIKE HISTOGRAM_INTERVALS;
INSERT INTO myhistint SELECT * FROM HISTOGRAM_INTERVALS;
b. Issue the UPDATE STATISTICS command for the required column groups.
c. Recompile the query.
d. Use EXPLAIN to review the cost information and determine if MDAM (if desired) is used for your
query. To determine if MDAM will be used, look at the results of EXPLAIN. MDAM occurs for the
FILE_SCAN operator. When key_type indicates MDAM for the FILE_SCAN operator, MDAM query
technology is used.
e. If necessary, use the UPDATE STATISTICS CLEAR option to remove histograms for unwanted
column groups.
f. If necessary, restore the backup histogram tables.
DELETE FROM HISTOGRAMS;
INSERT INTO HISTOGRAMS
SELECT * FROM myhist where table_uid in (select object_uid
from CAT.DEFINITION_SCHEMA_VERSION_1200.OBJECTS);
DELETE FROM HISTOGRAM_INTERVALS;
INSERT INTO HISTOGRAM_INTERVALS
SELECT * FROM myhistint where table_uid in (select object_uid
from CAT.DEFINITION_SCHEMA_VERSION_1200.OBJECTS);
Dimension Table Partitioning Techniques
If you plan to use parallel execution (note that some applications are more suited to serial execution),
plan to partition all dimension tables, even if they are small. This partitioning provides the optimizer with
more opportunities to select effective parallel plans. Small tables can easily be changed to non-
partitioned tables at a later time.
To take maximum advantage of parallel index updates, put a table’s indexes on separate volumes and
on separate CPUs to eliminate contention of parallel operations on indexes serviced by the same disk
process.
If possible, take advantage of any opportunity to partition large tables on the same columns. Joins
between these tables are more efficient than if the tables are unordered relative to one another.
Some customers have found it effective to use hash-partitioning techniques with dimension tables to
eliminate data-clustering patterns that reduce the effectiveness of parallel queries. For example, if all (or
many) of the predicate values for a particular column fall within one or two partitions, and this table is the
outer table, only one ESP (or just a few) will find qualifying rows. Only this ESP will continue running the
query, with diminished parallelism.
Managing Cache Memory Size
Cache is the buffer in memory that the disk process uses to keep copies of the disk blocks that have been
accessed most recently. If the disk process finds a requested block in cache, the disk process can satisfy
the request immediately without requesting a physical I/O operation.
Cache size has an important effect on performance. The larger the cache, the more likely that a block
needs to be read only once.