SQL/MX Release 2.0 Best Practices
Design Guidelines and Considerations  14 
default, the clustering key is the primary key, but a separate clustering key can be specified with the 
STORE BY clause.  
When a primary key is specified for a table, the clustering key can comprise all or some of the columns of 
the primary key. If only some of the primary key columns are used for clustering, you must use the leading 
columns of the primary key in the STORE BY clause. When all or some of the primary key columns are used 
in the STORE BY clause, the data is clustered by the primary key. In this situation, the primary key becomes 
the clustering key.  
When a primary key contains the DROPPABLE clause, the columns of the primary key can be updatable. 
When the primary key is droppable, the columns in the STORE BY clause can be different from the primary 
key columns. Data will be clustered by the STORE BY columns. In this situation, the primary key is not the 
clustering key. To enforce uniqueness of the clustering key, a SYSKEY is appended to the end of the 
clustering key. The b-tree organization of the table is based on the clustering key. The primary key is 
maintained in a separate index, generated by NonStop SQL/MX. This generated index will be a single 
partitioned index and will physically reside on the volume of the first partition for the table. Using the 
MODIFY utility, you can alter the primary key index to move the location of this partition and add 
additional partitions based on range partitioning. Use the SHOWDDL command to determine the name 
of the SQL/MX generated primary key index. 
NonStop SQL/MX supports two types of partitioning types: range partitioning and hash partitioning. Range 
partitioning places rows into different disk volumes based on the range partitioning key. Hash partitioning 
places rows into different disk volumes based on an algorithm of the hash key. The partitioning key must 
be the clustering key or part of the clustering key. If the clustering key has a SYSKEY appended to it to 
provide uniqueness when the clustering key is not the primary key, the SYSKEY portion is not used for 
partitioning. In other words, the SYSKEY is dropped when partitioning occurs, since the partitioning key 
does not need to be unique.  
Rows are stored in each partition based on the order of the clustering key. Using the clustering key to 
access data will provide the most efficient access method. Therefore, it is recommended that the 
columns used as the clustering key should also be used as the primary key. 
Primary-key Selection 
The selection of a primary key is one of the most important design considerations. The proper key 
structure gives the query an efficient access path, which enables good performance.  
Primary keys for fact tables normally are multicolumn keys composed of foreign key relationships with 
dimension tables. The combination of dimension column values makes each fact table row unique. Since 
the primary key is multicolumn, you can select the column order, which can have a substantial impact on 
query performance.  
Generally, the primary-key order is determined by the data-access requirements. That is, query 
requirements usually are given primary consideration, while load requirements are secondary. Columns 
that restrict the search space the most, and are used often by queries, are good candidates for the high-
order column. When different queries access the data through different columns, you must determine 
column placement based on additional considerations. You also might need to create a secondary 
index on certain columns to avoid full-table scans, although, for many cases, MDAM query technology 
eliminates this need. (Refer to the “MDAM query technology design considerations” section of this 
document for more details.) 
The principles of primary-key selection also apply to dimension tables. However, dimension tables usually 
have fewer columns and require fewer considerations for their design. Primary keys for dimension tables 
are usually composed of a single column, although sometimes multicolumn keys are used for tables that 
have meaningful keys.  
Listed below are the most important considerations for primary-key selection. Determine the priority of 
these items based on performance needs: 
•  Data-access efficiency 
•  Cardinality 
•  Partitioning 










