SQL/MX Release 2.0 Best Practices

Possible Implementations
Hash partitioning uses a hashing key to randomly distribute the data across the partitions. New rows are inserted in clustering-key order among the existing rows within each volume, causing table fragmentation as existing blocks are split to make room for the newly added rows. Online reloads are required periodically to defragment the partitions and maintain adequate scan performance. Note that all partitions become fragmented, because new data is inserted into the table across the entire partitioning range.
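As an illustration only, a hash-partitioned fact table might be defined with DDL along the following lines. The table, column, and volume names are hypothetical, and the exact partitioning clauses should be verified against the SQL/MX Reference Manual for your release:

   -- Illustrative sketch: rows are distributed across the volumes by hashing STORE_ID.
   CREATE TABLE SALES_FACT
     ( STORE_ID   INTEGER       NOT NULL
     , SALE_DATE  DATE          NOT NULL
     , ITEM_ID    INTEGER       NOT NULL
     , QUANTITY   INTEGER
     , AMOUNT     NUMERIC(12,2)
     , PRIMARY KEY (STORE_ID, SALE_DATE, ITEM_ID)
     )
     STORE BY PRIMARY KEY
     HASH PARTITION BY (STORE_ID)
       ( ADD LOCATION $DATA01
       , ADD LOCATION $DATA02
       , ADD LOCATION $DATA03
       );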
This type of partitioning is not difficult to plan. The partition sizes are balanced, and individual partitions
are not unexpectedly filled, provided that the unique entry count of the hash partitioning key is greater
than 50 times the number of partitioning volumes.
However, delete processing of old data is complex, and must be accomplished through the use of
queries. Scans of each partition can occur in parallel; however, each partition must be entirely scanned
to locate old rows for purging.
Depending on the number of unique values in the clustering-key columns that precede the search-predicate column (for example, a date column), you could use MDAM query technology processing to skip ranges of unaffected rows. Be aware that if delete processing is done through the MXCI utility (or other query tools),
a transaction based on NonStop TM/MP software will exist for the duration of the query, and may be
aborted due to the NonStop TM/MP software transaction timer. However, you can alter this timer before
and after running the query.
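For example, a purge of old data might be issued from MXCI as a single statement (the table and column names are illustrative). Because the date column is not the leading clustering-key column, each partition is either scanned in full or accessed through MDAM query technology, and the entire statement runs under one NonStop TM/MP transaction:

   -- Purge all rows older than the retention cutoff in one transaction.
   DELETE FROM SALES_FACT
   WHERE SALE_DATE < DATE '2003-01-01';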
Delete processing will acquire row locks that might escalate into partition or table locks, preventing other
transactional access to the data. Managing delete processing programmatically gives you more control
over the granularity of locks placed on the data. Make sure you understand the consequences of either
technique.
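A programmatic sketch of the batched approach, again with illustrative names: each unit of work deletes the old rows for only one leading-key value and then commits, so row locks are held briefly and are less likely to escalate.

   BEGIN WORK;
   DELETE FROM SALES_FACT
   WHERE STORE_ID = 1001
     AND SALE_DATE < DATE '2003-01-01';
   COMMIT WORK;

   BEGIN WORK;
   DELETE FROM SALES_FACT
   WHERE STORE_ID = 1002
     AND SALE_DATE < DATE '2003-01-01';
   COMMIT WORK;
   -- Repeat for each remaining STORE_ID value.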
Update processing requirements are similar to those of delete processing.
Query processing is more likely to use parallel execution, especially if the high-order clustering column is
omitted from the list of predicates. In this case, you might use MDAM query technology to avoid full
partition scans. Hash partitioning structures the data in each partition based on the clustering key, which is suitable for query processing and can take advantage of MDAM query technology processing capabilities.
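For example, a query that supplies only a date predicate (illustrative names) can still avoid full partition scans when MDAM query technology probes the distinct values of the leading clustering-key column in each partition:

   SELECT STORE_ID, SUM(AMOUNT)
   FROM SALES_FACT
   WHERE SALE_DATE BETWEEN DATE '2003-06-01' AND DATE '2003-06-30'
   GROUP BY STORE_ID;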
Deleting and updating with hash partitioning have a greater impact on database management than they do with sequential partitioning.
You must schedule online reloads to run periodically to eliminate fragmentation and maintain good scan
performance. All partitions may require online reloads, because data is inserted into every partition of the table. Hash data partitions must be managed carefully to avoid partition-full conditions, and to maintain
the balance of data across partitions. You might need to run the online partition-management facilities
periodically to rebalance the partitions.
Hash partitioning supports secondary indexes, if they are required on the fact table, because the data is
added using insert processing.
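For example, a secondary index on the date column (names are illustrative) could support purge and query predicates that omit the leading clustering-key column:

   CREATE INDEX SALES_FACT_DATE_IX ON SALES_FACT (SALE_DATE);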
Backup requirements are also complex. Because a load adds rows to all partitions, the entire table must be
backed up. For small to medium-size tables, backing up the entire table or taking NonStop TM/MP online
dumps, and completing within the availability window, may be adequate. For larger tables or
applications that load data frequently or continuously, keeping copies of the original source data and
reloading it when necessary may be more effective.
Sequential range partitioning and hash partitioning each have advantages. Sequential partitioning
preserves table compactness and minimizes database maintenance (there is no fragmentation), while
hash partitioning can evenly distribute rows across all partitions, reducing bottlenecks by increasing
parallelism.
When queries access the data from the fact table, MDAM query technology probes for all values of the
hash key, one for each partition, which is efficient for query processing. Queries are more likely to exploit
the capabilities of the inherent parallelism. Additionally, a view can be constructed in place of the fact
table, which forces a join between the dimension table and the fact table on the hash key. This method
incurs the cost of an additional join, but eliminates MDAM query technology processing, because the
hash-key values can be obtained from the dimension table. This method has been shown to produce
better results when multiple ESPs all begin MDAM query technology probing of the fact table at the same time.
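A sketch of such a view, with illustrative names, joining the fact table to the dimension table that supplies the hash-key values:

   CREATE VIEW SALES_FACT_V AS
     SELECT F.STORE_ID, F.SALE_DATE, F.ITEM_ID, F.QUANTITY, F.AMOUNT
     FROM STORE_DIM D, SALES_FACT F
     WHERE D.STORE_ID = F.STORE_ID;

Queries written against SALES_FACT_V then obtain the hash-key values through the join rather than through MDAM query technology probes.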