SQL/MX Release 2.0 Best Practices

Project Activities Guideline for Database Activities 11

• Primary key analysis of fact tables. Are multicolumn keys needed or desired? Determine key

distributions by partition. Consider the effect on MDAM query technology processing.

• Primary key analysis of dimension tables. For a star schema, typically choose single-column keys. To

use multicolumn keys, you must revisit the data model. Is the snowflake design technique a better

choice?

• Map partitions to configuration. Determine which partitions of which tables to collocate. How would

partitions affect fact and dimension tables and query-plan generation and execution? Also,

consider the effect of partitioned and non-partitioned dimension tables.

• Analyze artificial values against intelligent values and derived key values. Is key mapping necessary?

What is the effect of key mapping on query performance? Can MDAM query technology be used

for intelligent keys? Derived keys or meaningful keys can be exploited to avoid joins. Will dimension-

key changes cascade to the fact tables?

Secondary Index Analysis

• Design for MDAM query technology. Avoid the use of secondary indexes on fact tables. Small

dimension tables might not benefit from secondary indexes.

• Design for index-only access. If secondary indexes are used, can index-only access be designed in?

Create Catalog and Schema

• Create the catalog and schema. Create the system and database catalog and schema.

• Create the Data Definition Language (DDL). Build the DDL to reflect the specifications listed in this

section.

• Create scripts to manipulate and manage the schema. Scripting may be done using utilities, editors,

and so on, as defined by the project.

Space Analysis

Temporary space analysis. How much space is needed for hashing of temp tables, for client tools, for sort

space, for refresh and staging areas, and for anticipated growth?

Physical Access

• Location of catalog and schema, application, and audit. Determine the impact of metadata

placement on query performance. Determine the correct audit configuration.

• View analysis. Analyze the views to hide partitions and force joins. Hide some physical constructs

from client tools.

• Analysis of parallelism. Will parallel query plans be generated? Will parallel plans degenerate into

serial access? How many executor server processes (ESPs) will be spawned?

Insert and Update

• Change data capture. Where does data capture occur? How is data migrated to the data-load

process? Are updates applied to history?

• Data transformation. Translate, encode, decode, and recode.

• Cleanse, correct, and discard. Reconcile, integrate, combine, and de-dup.

• Dimension table updates. Are dimension-table updates changing slowly? Update, insert, and

reload.

• Audit implications. Audit is always on. Inserts generate audit but immediate dump is necessary if the

data is high volume.

• Keys for aggregates. What is the use of dimension keys compared to artificial keys?

• Management of transactions. How and when are transactions bracketed?

• Recoverability and restart ability. Build robust code. Do tables need to carry a “last update”

column?