Query Analysis
• Query types. Determine the query types, rank them by importance, and group them by complexity.
Determine the frequency of execution for each type. Limit your analysis to approximately 10 to 20
queries; more would make the analysis too complex, and fewer would not adequately represent the
business requirements.
• Query structure. Are the queries written by individual users or generated by intelligent client tools?
Are there multistep or multipass queries?
• Response time requirements. Determine the response-time requirements for each type of query.
Identify which queries require online response times and which can tolerate deferred response
times.
Logical Access
• Translate queries. Translate business queries to SQL queries. Identify tables, joins, predicates,
groupings, aggregations, and so on. (See the first sketch after this list.)
• I/O analysis. Estimate the number of rows accessed and the number of rows used; the ratio between
them shows how much work a scan does relative to what the query actually needs. Evaluate these
numbers against the response-time requirements. (See the second sketch after this list.)
• Logical query analysis. Perform logical analysis using the Asset BMC product.
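
For example, the business question "What were total sales by region for the last quarter?" might
translate to a query such as the following. The schema is illustrative only, not part of this guideline:

    -- Illustrative schema: sales(region_id, sale_amount, sale_date),
    --                      region(region_id, region_name)
    SELECT   r.region_name,
             SUM(s.sale_amount) AS total_sales
    FROM     sales s, region r
    WHERE    s.region_id = r.region_id
      AND    s.sale_date BETWEEN DATE '2004-01-01' AND DATE '2004-03-31'
    GROUP BY r.region_name
    ORDER BY total_sales DESC;

This single query already identifies the tables (sales, region), the join (on region_id), the predicate
(the date range), the grouping (region_name), and the aggregation (SUM).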
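One simple way to gather the I/O estimates is to compare the total row count (the rows a full scan
must access) with the count of rows that satisfy the query predicate (the rows used). A large gap
suggests that an index or partitioning on the predicate column is needed to meet online
response-time requirements. The sales table is again illustrative:

    -- Rows accessed by a full table scan
    SELECT COUNT(*) FROM sales;

    -- Rows actually used (rows that satisfy the query predicate)
    SELECT COUNT(*)
    FROM   sales
    WHERE  sale_date BETWEEN DATE '2004-01-01' AND DATE '2004-03-31';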
Client Tools
• Tool categories. Determine the categories of tools that must be supported for each user group.
• Tool selection. In each category, choose two or three tools to evaluate for support (for example,
relational online analytical processing [ROLAP], statistical, and ad hoc tools). Limit the number of
tools to be evaluated; this number may be driven by user preferences, installed tools, or both. Is the
tool “database aware”? Does it use aggregate tables, insulate the user from physical changes, and
so on?
• Tool database requirements. Determine the space, metadata, and platform requirements; the
connectivity requirements; data-type compatibility; naming-convention requirements; and so on.
• Data mining. Will data-mining tools be used against data extracts? Can data-mining tools be used
directly against NonStop SQL/MX? (This determination might require a common definition for the
term “data mining.” Many users misuse this term, thinking that ROLAP and data mining are the
same.)
Physical Design
Denormalization Analysis
• Join elimination. Eliminate joins, define aggregate tables, and resolve data-sparsity issues.
• Aggregate definitions. Identify which aggregates are frequently used. Estimate the disk space the
aggregates require, and identify the impact on load and update processing. Does the potential exist
for a significant positive performance impact? (See the first sketch after this list.)
• Data sparsity. Will combining tables create data sparsity? (For example, combining transaction
tables and account-history tables would improve account-history queries, but not every account
has activity every day, so many account-day rows would be empty.) See the second sketch after
this list.
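
As a sketch of an aggregate definition, a frequently used daily sales summary might be materialized
as its own table and refreshed during the load cycle. The DDL and names are illustrative:

    -- Illustrative aggregate table, refreshed by the load process
    CREATE TABLE daily_sales_agg
    ( sale_date   DATE          NOT NULL
    , region_id   INTEGER       NOT NULL
    , total_sales NUMERIC(18,2)
    , txn_count   LARGEINT
    , PRIMARY KEY (sale_date, region_id)
    );

    INSERT INTO daily_sales_agg
    SELECT   sale_date, region_id, SUM(sale_amount), COUNT(*)
    FROM     sales
    GROUP BY sale_date, region_id;

The trade-off is the one noted above: queries answered from the aggregate avoid large scans of the
base table, but every load must also maintain the aggregate.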
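Sparsity can be measured before denormalizing by comparing the accounts that actually have
activity on a given day with the full account population; if active accounts are a small fraction, the
combined table will be mostly empty. A sketch, with illustrative table names:

    -- If active_accounts is a small fraction of all_accounts,
    -- a combined account-day table will be sparse.
    SELECT (SELECT COUNT(*) FROM accounts) AS all_accounts,
           COUNT(DISTINCT account_id)      AS active_accounts
    FROM   transactions
    WHERE  txn_date = DATE '2004-01-15';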
Partition and Primary Key Analysis
• Analysis of time-ordered data. Partitioning by date allows fast loads and archiving but creates hot
spots, because all inserts of current data land in the newest partition. Partitioning by another
dimension key can create fragmentation, reducing sequential I/O and creating a need for frequent
reorganization. (See the first sketch after this list.)
• Hash partition analysis. What is the impact on load and update processing and on the archive
strategy? Will sequential I/O increase or decrease? (See the second sketch after this list.)
• Skewed data analysis. Do candidate columns for the primary key contain skewed data? (See the
third sketch after this list.)
• Fragmentation analysis. What degree of fragmentation is introduced by the data-load cycle? How
much of the database must be reorganized and how often must reorganization occur to eliminate
the fragmentation?
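
As a sketch of date-based partitioning, a transaction table might be range-partitioned by month
across volumes. The DDL is SQL/MX-style; the table definition, volume names, and exact LOCATION
syntax are illustrative and installation-dependent:

    -- Illustrative SQL/MX-style range partitioning by month
    CREATE TABLE txn_history
    ( txn_date   DATE          NOT NULL
    , account_id INTEGER       NOT NULL
    , amount     NUMERIC(18,2)
    , PRIMARY KEY (txn_date, account_id)
    )
    LOCATION $data01
    PARTITION
    ( ADD FIRST KEY (DATE '2004-02-01') LOCATION $data02
    , ADD FIRST KEY (DATE '2004-03-01') LOCATION $data03
    );

Loads and archiving can then operate on whole partitions at either end of the date range, which is
fast; the cost is that all inserts of current data concentrate on the newest partition, creating the
hot spot.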
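A hash-partitioned alternative spreads inserts evenly across volumes and removes the hot spot, at
the cost of date clustering: a date-range scan must now touch every partition, and archiving by date
can no longer simply drop a partition. Again, the names are illustrative, and the exact
hash-partitioning clause should be checked against the SQL/MX release in use:

    -- Illustrative SQL/MX-style hash partitioning on the account key
    CREATE TABLE txn_history_h
    ( account_id INTEGER       NOT NULL
    , txn_date   DATE          NOT NULL
    , amount     NUMERIC(18,2)
    , PRIMARY KEY (account_id, txn_date)
    )
    LOCATION $data01
    HASH PARTITION BY (account_id)
    ( ADD LOCATION $data02
    , ADD LOCATION $data03
    );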
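A simple skew check is to examine the value distribution of each candidate key column; if a few
values account for a large share of the rows, partitions built on that column will carry a
disproportionate share of the data and the I/O. For example:

    -- A few disproportionately large counts indicate skew
    SELECT   account_id, COUNT(*) AS row_count
    FROM     txn_history
    GROUP BY account_id
    ORDER BY row_count DESC;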