SQL/MX Release 2.0 Best Practices

Performance Tuning 45

Performance Tuning

Improving Performance

This section discusses some of the common performance problems associated with large databases.

Additional information exists in other manuals and is not repeated here. Specifically, the SQL/MX Query

Guide contains several chapters dedicated to analyzing and improving query and database

performance, and the SQL/MX Installation and Management Guide contains chapters on measuring and

enhancing performance. Be sure to read these manuals for a thorough understanding of the material,

and to gain an understanding of the tools and techniques available for performance analysis.

When you must tune a database, you must consider several tradeoffs. For example, it is impossible to

optimize all queries or queries and batch-loading workloads. One of the first steps involved in tuning a

database is to determine the most important objectives. You also must be prepared to replicate certain

portions of the database through the use of aggregate tables, or to create secondary indexes.

Addressing Problems with Numerous ESP Processes

Large databases with tables that have many partitions can sometimes experience problems related to

the large number of active ESP processes. Relatively few active ESP processes can saturate a processor,

though how many varies with the processor technology and query workload.

Minimize Table Partitions

Although a large database almost always uses partitioned tables, try to minimize the number of table

partitions, and keep the partitions to a multiple of the number of processors. When possible, use one or

two partitions for each processor. To minimize successfully, you need to understand the growth

requirements and volatility of the table. For example, a table that is partitioned, organized, and loaded

by date would be a good candidate. Once loaded, the partition sizes would not change. A table like this

could be sized with the minimum number of partitions required to hold the data during its lifetime. In

addition, it would be advantageous to round the number of partitions to a multiple of the available

processors, and balance the data across the partitions. This would provide the most balanced and

efficient scan performance while minimizing the number of ESP processes.

Tables that are partitioned and organized differently (such as in hash partitioning) are volatile because

blocks split during random inserts. You must establish a schedule for reloading these tables and

understand the variability of the space usage. The same approach can be taken to minimize the number

of partitions.

Minimizing table partitions also minimizes the number of active ESPs used for queries, and helps eliminate

resource contention. Note that if overly busy processors are a pervasive problem, you may need

additional hardware resources. Attempting to tune an undersized system provides little gain.

Recommended Maximum Number of Partitions

HP recommends that you limit the number of partitions in an SQL/MX table or index to no more than 512.

If you exceed this recommended limit, you may get an MXCMP internal error because of a shortage of

virtual memory space.

Addressing Problems with Skewed Data Distributions

Tables sometimes contain columns that have significantly skewed values. For example, a customer had a

date column with a range that exceeded 100 years, although the overwhelming majority of dates

occurred within the last several years, and the current time period was the most often queried. Since the

optimizer assumes an even distribution, a range predicate of the form “DATE between “1999-01-01” and

“1999-12-31” caused the optimizer to expect that fewer than 1 percent of the table’s rows would qualify

for the predicate. In fact, closer to 20 percent of the table’s rows qualified for the predicate. In many

queries, the optimizer often expected only about 100,000 rows from a table when several tens of millions

actually qualified.