SQL/MX Release 2.0 Best Practices

Performance Tuning 45
Performance Tuning
Improving Performance
This section discusses some of the common performance problems associated with large databases.
Additional information exists in other manuals and is not repeated here. Specifically, the SQL/MX Query
Guide contains several chapters dedicated to analyzing and improving query and database
performance, and the SQL/MX Installation and Management Guide contains chapters on measuring and
enhancing performance. Be sure to read these manuals for a thorough understanding of the material,
and to gain an understanding of the tools and techniques available for performance analysis.
When you must tune a database, you must consider several tradeoffs. For example, it is impossible to
optimize all queries or queries and batch-loading workloads. One of the first steps involved in tuning a
database is to determine the most important objectives. You also must be prepared to replicate certain
portions of the database through the use of aggregate tables, or to create secondary indexes.
Addressing Problems with Numerous ESP Processes
Large databases with tables that have many partitions can sometimes experience problems related to
the large number of active ESP processes. Relatively few active ESP processes can saturate a processor,
though how many varies with the processor technology and query workload.
Minimize Table Partitions
Although a large database almost always uses partitioned tables, try to minimize the number of table
partitions, and keep the partitions to a multiple of the number of processors. When possible, use one or
two partitions for each processor. To minimize successfully, you need to understand the growth
requirements and volatility of the table. For example, a table that is partitioned, organized, and loaded
by date would be a good candidate. Once loaded, the partition sizes would not change. A table like this
could be sized with the minimum number of partitions required to hold the data during its lifetime. In
addition, it would be advantageous to round the number of partitions to a multiple of the available
processors, and balance the data across the partitions. This would provide the most balanced and
efficient scan performance while minimizing the number of ESP processes.
Tables that are partitioned and organized differently (such as in hash partitioning) are volatile because
blocks split during random inserts. You must establish a schedule for reloading these tables and
understand the variability of the space usage. The same approach can be taken to minimize the number
of partitions.
Minimizing table partitions also minimizes the number of active ESPs used for queries, and helps eliminate
resource contention. Note that if overly busy processors are a pervasive problem, you may need
additional hardware resources. Attempting to tune an undersized system provides little gain.
Recommended Maximum Number of Partitions
HP recommends that you limit the number of partitions in an SQL/MX table or index to no more than 512.
If you exceed this recommended limit, you may get an MXCMP internal error because of a shortage of
virtual memory space.
Addressing Problems with Skewed Data Distributions
Tables sometimes contain columns that have significantly skewed values. For example, a customer had a
date column with a range that exceeded 100 years, although the overwhelming majority of dates
occurred within the last several years, and the current time period was the most often queried. Since the
optimizer assumes an even distribution, a range predicate of the form “DATE between “1999-01-01” and
“1999-12-31” caused the optimizer to expect that fewer than 1 percent of the table’s rows would qualify
for the predicate. In fact, closer to 20 percent of the table’s rows qualified for the predicate. In many
queries, the optimizer often expected only about 100,000 rows from a table when several tens of millions
actually qualified.