Query Analysis
• Query types. Determine the query types, rank them by importance, and group them by complexity.
Determine the frequency of execution for each type. Limit your analysis to approximately 10 to 20
queries; more would make the analysis too complex, and fewer would not adequately represent the
business requirements.
• Query structure. Are the queries written by individual users or generated by intelligent client tools?
Are there multistep or multipass queries?
• Response time requirements. Determine the response-time requirements for each type of query.
Identify which queries require online response times and which can tolerate deferred response
times.
Logical Access
• Translate queries. Translate business queries to SQL queries. Identify tables, joins, predicates,
groupings, aggregations, and so on. (See the first sketch after this list.)
• I/O analysis. Estimate the number of rows accessed and the number of rows used; the ratio between
them shows how much work a scan does relative to what the query actually needs. Evaluate these
numbers against the response-time requirements. (See the second sketch after this list.)
• Logical query analysis. Perform logical analysis using the Asset BMC product.
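
For example, the business question "What were total sales by region for the last quarter?" might
translate to a query such as the following. The schema is illustrative only, not part of this guideline:

    -- Illustrative schema: sales(region_id, sale_amount, sale_date),
    --                      region(region_id, region_name)
    SELECT   r.region_name,
             SUM(s.sale_amount) AS total_sales
    FROM     sales s, region r
    WHERE    s.region_id = r.region_id
      AND    s.sale_date BETWEEN DATE '2004-01-01' AND DATE '2004-03-31'
    GROUP BY r.region_name
    ORDER BY total_sales DESC;

This single query already identifies the tables (sales, region), the join (on region_id), the predicate
(the date range), the grouping (region_name), and the aggregation (SUM).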
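One simple way to gather the I/O estimates is to compare the total row count (the rows a full scan
must access) with the count of rows that satisfy the query predicate (the rows used). A large gap
suggests that an index or partitioning on the predicate column is needed to meet online
response-time requirements. The sales table is again illustrative:

    -- Rows accessed by a full table scan
    SELECT COUNT(*) FROM sales;

    -- Rows actually used (rows that satisfy the query predicate)
    SELECT COUNT(*)
    FROM   sales
    WHERE  sale_date BETWEEN DATE '2004-01-01' AND DATE '2004-03-31';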
Client Tools
• Tool categories. Determine the categories of tools that must be supported for each user group.
• Tool selection. In each category, choose two or three tools to evaluate for support (for example,
relational online analytical processing [ROLAP], statistical, and ad hoc tools). Limit the number of
tools to be evaluated; this number may be driven by user preferences, installed tools, or both. Is the
tool “database aware”? Does it use aggregate tables, insulate the user from physical changes, and
so on?
• Tool database requirements. Determine the space, metadata, and platform requirements; the
connectivity requirements; data-type compatibility; naming-convention requirements; and so on.
• Data mining. Will data-mining tools be used against data extracts? Can data-mining tools be used
directly against NonStop SQL/MX? (This determination might require a common definition for the
term “data mining.” Many users misuse this term, thinking that ROLAP and data mining are the
same.)
Physical Design
Denormalization Analysis
• Join elimination. Eliminate joins, define aggregate tables, and resolve data-sparsity issues.
• Aggregate definitions. Identify which aggregates are frequently used. Estimate the disk space the
aggregates require, and identify the impact on load and update processing. Does the potential exist
for a significant positive performance impact? (See the first sketch after this list.)
• Data sparsity. Will combining tables create data sparsity? (For example, combining transaction
tables and account-history tables would improve account-history queries, but not every account
has activity every day, so many account-day rows would be empty.) See the second sketch after
this list.
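
As a sketch of an aggregate definition, a frequently used daily sales summary might be materialized
as its own table and refreshed during the load cycle. The DDL and names are illustrative:

    -- Illustrative aggregate table, refreshed by the load process
    CREATE TABLE daily_sales_agg
    ( sale_date   DATE          NOT NULL
    , region_id   INTEGER       NOT NULL
    , total_sales NUMERIC(18,2)
    , txn_count   LARGEINT
    , PRIMARY KEY (sale_date, region_id)
    );

    INSERT INTO daily_sales_agg
    SELECT   sale_date, region_id, SUM(sale_amount), COUNT(*)
    FROM     sales
    GROUP BY sale_date, region_id;

The trade-off is the one noted above: queries answered from the aggregate avoid large scans of the
base table, but every load must also maintain the aggregate.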
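Sparsity can be measured before denormalizing by comparing the accounts that actually have
activity on a given day with the full account population; if active accounts are a small fraction, the
combined table will be mostly empty. A sketch, with illustrative table names:

    -- If active_accounts is a small fraction of all_accounts,
    -- a combined account-day table will be sparse.
    SELECT (SELECT COUNT(*) FROM accounts) AS all_accounts,
           COUNT(DISTINCT account_id)      AS active_accounts
    FROM   transactions
    WHERE  txn_date = DATE '2004-01-15';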
Partition and Primary Key Analysis
• Analysis of time-ordered data. Partitioning by date allows fast loads and archiving but creates hot
spots, because all inserts of current data land in the newest partition. Partitioning by another
dimension key can create fragmentation, reducing sequential I/O and creating a need for frequent
reorganization. (See the first sketch after this list.)
• Hash partition analysis. What is the impact on load and update processing and on the archive
strategy? Will sequential I/O increase or decrease? (See the second sketch after this list.)
• Skewed data analysis. Do candidate columns for the primary key contain skewed data? (See the
third sketch after this list.)
• Fragmentation analysis. What degree of fragmentation is introduced by the data-load cycle? How
much of the database must be reorganized and how often must reorganization occur to eliminate
the fragmentation?
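
As a sketch of date-based partitioning, a transaction table might be range-partitioned by month
across volumes. The DDL is SQL/MX-style; the table definition, volume names, and exact LOCATION
syntax are illustrative and installation-dependent:

    -- Illustrative SQL/MX-style range partitioning by month
    CREATE TABLE txn_history
    ( txn_date   DATE          NOT NULL
    , account_id INTEGER       NOT NULL
    , amount     NUMERIC(18,2)
    , PRIMARY KEY (txn_date, account_id)
    )
    LOCATION $data01
    PARTITION
    ( ADD FIRST KEY (DATE '2004-02-01') LOCATION $data02
    , ADD FIRST KEY (DATE '2004-03-01') LOCATION $data03
    );

Loads and archiving can then operate on whole partitions at either end of the date range, which is
fast; the cost is that all inserts of current data concentrate on the newest partition, creating the
hot spot.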
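A hash-partitioned alternative spreads inserts evenly across volumes and removes the hot spot, at
the cost of date clustering: a date-range scan must now touch every partition, and archiving by date
can no longer simply drop a partition. Again, the names are illustrative, and the exact
hash-partitioning clause should be checked against the SQL/MX release in use:

    -- Illustrative SQL/MX-style hash partitioning on the account key
    CREATE TABLE txn_history_h
    ( account_id INTEGER       NOT NULL
    , txn_date   DATE          NOT NULL
    , amount     NUMERIC(18,2)
    , PRIMARY KEY (account_id, txn_date)
    )
    LOCATION $data01
    HASH PARTITION BY (account_id)
    ( ADD LOCATION $data02
    , ADD LOCATION $data03
    );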
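A simple skew check is to examine the value distribution of each candidate key column; if a few
values account for a large share of the rows, partitions built on that column will carry a
disproportionate share of the data and the I/O. For example:

    -- A few disproportionately large counts indicate skew
    SELECT   account_id, COUNT(*) AS row_count
    FROM     txn_history
    GROUP BY account_id
    ORDER BY row_count DESC;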