SQL/MX Release 2.0 Best Practices

Possible Implementations 30
Partitioning of tables and indexes is required if you want to take advantage of parallelism. HP
recommends that you limit the number of partitions used, because processes perform better when they
access fewer drives. This arrangement is different from the traditional SQL/MP method of database
design. Normally, distributing partitions across many drives increases performance.
Fact Table Partitioning Techniques
Fact tables are often very large and, therefore, must be partitioned across multiple disk volumes.
To obtain good parallel scan performance, the partitions must be distributed over all processors. Ideally,
the number of table partitions should be a multiple of the number of processors, and the partitions should
be equal in size. Otherwise, scan performance is uneven and is dominated by the largest partition.
To avoid controller I/O contention, place multiple partitions within the same processor on disk volumes
attached to different controllers, if possible.
You can partition tables in three ways. Each method relies on the standard range partitioning or hash
partitioning capabilities of SQL/MX software. You must fully understand the advantages and
disadvantages of each method at design time. The specific method used affects the load architecture,
query performance, and degree of database maintenance.
To choose an effective partitioning method, you must first understand the nature of the data and the
type of processing that will occur, both in batch and query.
Consider these questions when you decide which column or columns to use for partitioning and the order
of the key columns:
What is the expected rate and frequency of loads?
What is the time window for the loads?
Are data loads done in batch or online?
Will new rows be added in sequential order or in random order?
After the data is loaded, will it be processed in batch or by queries?
Does the partitioning method aid the order in which the data is accessed?
How often will queries supply predicates for the key columns, including the partitioning column?
What is the data distribution and unique value count of each key column?
Which key columns are likely to enable MDAM query technology processing?
A fact table is the central part of the star schema, often the schema of choice for databases. The
granularity of the fact table is determined by the level of detail in the dimension tables. Each dimension is
represented in the fact table by a dimensional value, which becomes one of the fact table key columns.
The number of dimensions usually equals the number of fact table key columns. The primary key of the
fact table is composed of the collection of dimension columns and uniquely identifies each row. Each
fact table key column is a foreign key to a corresponding dimension table.
For example, a simple retail fact table may consist of three dimensions and has a grain of time, location,
and product. Each dimension is represented in the fact table as an individual key column, and each row
of the fact table includes all three dimensions.
If your fact table does not possess these characteristicsthat is, if there are key columns in the fact table
that do not reference dimension tables, or if there are data columns in the fact table that are foreign
keys to dimension tablesreview your design’s dimensional modeling techniques.
In standard range partitioning, the leading column or columns partition the table horizontally. For
example, if the chosen key order is location, time, and product, you could use location for the
partitioning column. You would order the table data by location and distribute it across all disk volumes
based on specific partitioning values.