SQL/MX Release 2.0 Best Practices
Possible Implementations 31
Sequential Range Partitioning
Sequential partitioning means that the partitioning and clustering key values of all subsequent data
added to a fact table are always greater than the partitioning and clustering key values in the existing
data. That is, the partition range is continually increasing.
For example, sequential partitioning occurs in a table that uses a date for the partitioning column. In the
retail fact table example, time (expressed by date) is the leading clustering key column and is also the
partitioning key. All new rows added to the table have date values greater than the existing dates in the
table. The partitioning range of this table increases continually as the dates increase.
Fact tables normally undergo an initial load, where the data collected to date is added to the table, and
subsequent loads periodically add additional data.
Consider the impact of adding new data, deleting old data, updating data, and querying existing data.
Also, consider the impact of these actions on database-management activities and backup-and-restore
operations.
In this example, all new rows added to the table always fall logically at the end of the table, since new
rows have dates later than dates in the existing rows. To achieve more even distribution, you should use
hash partitioning, hashing on the date, to balance the load.
Note that since the newly added data is clustered around the partitioning value, parallelism during a
load is limited or nonexistent. Also, a large data load may not complete within the time allotted.
Deleting old data may be as simple as dropping the oldest partitions, or using the SQL/MX software reuse-
partition facility, which allows an existing partition to be reused by assigning a new partition value,
enabling data to be rotated over a constant set of disk partitions. The reuse-partition facility is faster and
more efficient than dropping old partitions and adding new partitions to a table. However, neither
method requires queries to delete rows from the table to purge old data.
Usually, it is not necessary to update existing fact table data, because this data is historical in nature.
However, when updating is necessary, its processing will be pseudo random at best, although it can be
performed in parallel streams, processing each disk partition concurrently.
Since date is the leading key column in this example, and most DSS queries use a date predicate on
historical data, some queries seldom benefit from parallelism. The requested data is often distributed over
a few partitions or contained entirely within a single partition. This arrangement is especially common if
many dates are contained within one partition. If many users query across the same date range
concurrently, performance bottlenecks may occur on the processors and disk volumes being accessed,
and hardware resources may not be used effectively. However, if the queries specify data ranges that
are non-overlapping and diverse, this partitioning technique may be adequate.
Database-management activities are minimized when using this partitioning method. Partitions must be
sized for the expected data volume for a specific date range. Once the data is loaded and FUP RELOAD
has been run, no further management is usually needed other than purging the old data at the
appropriate time.
Secondary indexes are seldom used on fact tables because MDAM query technology often provides
efficient access to data even though the leading key columns may be excluded from the query.
However, more complex data models that require many dimension columns in fact tables use secondary
indexes. Note that LOAD and LOAD APPEND are not available in SQL/MX Release 2.0.
If you must drop and create new partitions on a regular basis, you should use the MODIFY REUSE
command to modify the index.
Tape backups must be explicitly managed with the Backup utility. However, if previously loaded data is
not updated, only the newly loaded partitions must be backed up to tape. Previous data partitions are
unaffected.
Hash Partitioning
Hash partitioning is a new function available with NonStop SQL/MX Release 2.0. The partitioning column is
selected from any of the clustering key columns of the table.