HP NonStop SQL/MP Query Guide Abstract This manual describes how to write queries for an HP NonStop™ SQL/MP database. Users who want information on how to use the SELECT statement, as well as those who program or manage a NonStop SQL/MP database, will find this manual helpful. Product Version NonStop SQL/MP G07 Supported Release Version Updates (RVUs) This publication supports D31.00 and all subsequent D-series RVUs and G06.
Document History Part Number Product Version Published 093964-000 NonStop SQL/MP D30 December 1994 118375-001 NonStop SQL/MP D30 February 1996 524488-001 NonStop SQL/MP D30 August 2002 524488-002 NonStop SQL/MP G07 February 2004 524488-003 NonStop SQL/MP G07 November 2004
HP NonStop SQL/MP Query Guide Index Examples What’s New in This Manual xi Manual Information xi New and Changed Information About This Manual xiii Who Should Use This Manual? Prerequisites xiii Organization xiii Related Manuals xiv Notation Conventions xvi Figures Tables xi xiii 1.
1. Retrieving Data: How to Write Queries (continued) Contents 1.
2. The Optimizer (continued) Contents 2. The Optimizer (continued) Processor Assignment by the SQL/MP Optimizer and Executor for Executor Server Processes (ESPs) 2-5 3.
Contents 3. Improving Query Performance Through Query Design (continued) 3.
4. Improving Query Performance With Environmental Options (continued) Contents 4.
6. Analyzing Query Performance Contents 6.
Contents 6. Analyzing Query Performance (continued) 6.
Examples (continued) Contents Examples (continued) Example 1-6. Example 1-7. Example 1-8. Example 1-9. Example 1-10. Example 6-1. Example 6-2. Example 6-3. Example 6-4. Example 6-5. Example 6-6. Example 6-7. Example 6-8. Example 6-9. Example 6-10. Example 6-11. Example 6-12. Example 6-13. Example 6-14. Example 6-15. Example 6-16. Example 6-17. Example 6-18. Example 6-19. Example 6-20. Example 6-21. Example 6-22. Example 6-23. Example 6-24. Example 6-25. Example 6-26. Example 6-27. Example 6-28.
Examples (continued) Contents Examples (continued) Example 6-29. Example 6-30. Example 6-31. Example 6-32. Example 6-33. Example 6-34. Example 6-35. Example 6-36. Example 6-37. Example 6-38. Example 6-39. Example 6-40. Example 6-41. Example 6-42. Example 6-43. Example 6-44. Example 6-45. Example 6-46. Example 6-47. Example 6-48. Example 6-49. Example 6-50. Example 6-51. Example 6-52. Example 6-53.
Figures (continued) Contents Figures (continued) Figure 3-2. Figure 3-3. Figure 3-4. Figure 4-1. Figure 4-2. Figure 4-3. Figure 4-4. Sort Merge Join 3-27 Key-Sequenced Merge Join 3-29 Hash Function Example 3-30 Parallel Execution of a SELECT Statement 4-15 Single-Row Access 4-22 Real Sequential Block Buffering (RSBB) 4-23 Virtual Sequential Block Buffering (VSBB) 4-25 Tables Table i. Table 1-1. Table 1-2. Table 1-3. Table 1-4. Table 1-5. Table 3-1. Table 3-2. Table 3-3. Table 3-4. Table 4-1. Table 5-1.
What’s New in This Manual Manual Information HP NonStop SQL/MP Query Guide Abstract This manual describes how to write queries for an HP NonStop™ SQL/MP database. Users who want information on how to use the SELECT statement, as well as those who program or manage a NonStop SQL/MP database, will find this manual helpful. Product Version NonStop SQL/MP G07 Supported Release Version Updates (RVUs) This publication supports D31.00 and all subsequent D-series RVUs and G06.
New and Changed Information What’s New in This Manual HP NonStop SQL/MP Query Guide —524488-003 xii
About This Manual NonStop SQL/MP is an HP implementation of a relational database management system that uses the industry-standard Structured Query Language (SQL) to define and manipulate data.
Related Manuals About This Manual Table i. Summary of Contents (page 2 of 2) Section 4, Improving Query Performance With Environmental Options Discusses several environmental factors that can influence query performance Section 5, Selectivity and Cost Estimates Discusses how the optimizer estimates selectivity, assigning cost to a query, evaluating cost estimates, how the optimizer chooses an execution plan, and forcing execution plans.
Related Manuals About This Manual Figure i.
Notation Conventions About This Manual Notation Conventions General Syntax Notation This list summarizes the notation conventions for syntax presentation in this manual. UPPERCASE LETTERS. Uppercase letters indicate keywords and reserved words. Type these items exactly as shown. Items not enclosed in brackets are required. For example: MAXATTACH lowercase italic letters. Lowercase italic letters indicate variable items that you supply. Items not enclosed in brackets are required.
General Syntax Notation About This Manual braces on each side of the list, or horizontally, enclosed in a pair of braces and separated by vertical lines. For example: LISTOPENS PROCESS { $appl-mgr-name } { $process-name } ALLOWSU { ON | OFF } | Vertical Line. A vertical line separates alternatives in a horizontal list that is enclosed in brackets or braces. For example: INSPECT { OFF | ON | SAVEABEND } … Ellipsis.
General Syntax Notation About This Manual HP NonStop SQL/MP Query Guide —524488-003 xviii
1 Retrieving Data: How to Write Queries A query is a statement that requests data from a database. This section describes how to write queries for a NonStop SQL/MP database. You can specify a query explicitly by using interactive SELECT statements, application-embedded SELECT and CURSOR statements, and report writer selections. You can specify a query implicitly in UPDATE, INSERT, and DELETE statements. A query can use either dynamic or static SQL.
Retrieving Data: How to Write Queries Using the SELECT Statement SQL). For more information, see Using the SELECT Statement in Programs on page 1-10. These related topics, discussed in other manuals, might also be of interest: • • To modify data with an UPDATE, INSERT, or DELETE statement, use query components. For more information about UPDATE, INSERT, and DELETE statements, see the SQL/MP Reference Manual and the SQL/MP Programming Manual for your host language.
Selecting Columns Retrieving Data: How to Write Queries Selecting Columns Selecting columns from a table is known as projection. The query in Figure 1-1 selects three columns: FIRST_NAME, LAST_NAME, and DEPTNUM. Figure 1-1. Selecting Columns From a Table (Projection) SELECT FIRST_NAME, LAST_NAME, DEPTNUM FROM EMPLOYEE EMPLOYEE EMPNUM FIRST_NAME LAST_NAME DEPTNUM JOBCODE SALARY 1 ROGER GREEN 9000 100 175500.00 23 JERRY HOWARD 1000 100 137000.
Selecting Rows Retrieving Data: How to Write Queries Selecting Rows Selecting rows in a table is called restriction. Figure 1-2 shows a SELECT statement that selects specific rows from the EMPLOYEE table and shows the result table. The SELECT statement uses a WHERE clause and predicates to restrict the number of rows returned: return only those employees who are in department number 9000. Figure 1-2.
Retrieving Data: How to Write Queries Organizing Results The WHERE clause, used in these examples, is described in Specifying Search Conditions on page 1-8. The ORDER BY Clause If you want your report to list employees from highest paid to lowest paid, you can add an ORDER BY clause, as shown in Example 1-1. The DESC keyword tells SQL to sort in descending order; the report lists Ben Henderson, who makes $65,000, before Mary Miller, who makes $56,000.
Retrieving Data: How to Write Queries Organizing Results The DISTINCT Clause The DISTINCT clause eliminates duplicate rows from the result table. Consider the query in Example 1-2. Part numbers 212 and 244 appear several times in the result. Example 1-2. SELECT Statement With Duplicate Rows SELECT PARTNUM FROM ODETAIL ; PARTNUM ------244 2001 . . 244 5103 . . 244 . . 212 . . 212 7301 --- 72 row(s) selected. To eliminate the duplicate rows, you can add a DISTINCT clause, as shown in Example 1-3.
Retrieving Data: How to Write Queries Organizing Results The GROUP BY Clause The GROUP BY clause groups rows with the same value and returns one row per group. The GROUP BY clause, like the DISTINCT clause, removes duplicate rows from the result, as well as performing other functions. To show how the GROUP BY clause works, consider the query from the previous subsection: SELECT PARTNUM FROM ODETAIL ; This query returns 72 rows, with part numbers 212 and 244 appearing several times in the result.
Retrieving Data: How to Write Queries Specifying Search Conditions Example 1-5. SELECT Statement With GROUP BY Clause and SUM Function SELECT PARTNUM, SUM (QTY_ORDERED) FROM ODETAIL GROUP BY PARTNUM ; PARTNUM (EXPR) ---------------212 20 244 47 . . . . 7301 96 --- 27 row(s) selected. When evaluating the GROUP BY clause, SQL considers all null values to be equal. The result can have at most one null group.
Specifying Search Conditions Retrieving Data: How to Write Queries The WHERE clause and HAVING clause are described next. The ON clause is described in Combining Data From More Than One Table on page 1-51. The WHERE Clause Suppose that you want a report of all employees whose salaries are greater than $50,000. You can add a WHERE clause to restrict the number of rows returned. Only those employees who earn more than $50,000 are included in the report. Example 1-6 shows the query and its results.
Retrieving Data: How to Write Queries Using the SELECT Statement in Programs Using the SELECT Statement in Programs A SELECT statement in a program typically retrieves data into a host variable. You can use two types of SELECT statements in a program: • • A single-row SELECT (also called a singleton or standalone SELECT) that returns a single row or value. A multiple-row SELECT (also called a cursor SELECT) that returns multiple rows one row at a time.
Retrieving Data: How to Write Queries Using the SELECT Statement in Programs The INTO clause of the SELECT statement is used to return a single-row result of a query to a host variable. Here is an example of a single-row SELECT statement that selects by a primary-key column, col1: EXEC SQL SELECT col2, col3, col4 INTO :hv2, :hv3, :hv4 FROM MYTABLE WHERE col1 = :hvkey END-EXEC.
Retrieving Data: How to Write Queries Using the SELECT Statement in Programs Multiple-Row (Cursor) SELECT A multiple-row SELECT statement returns multiple rows one row at a time. This technique is usually preferred over a single-row SELECT when retrieving multiple rows. A host variable cannot hold data from more than one row, so you must declare a cursor for this type of SELECT statement. A cursor is the mechanism for dealing with a set of rows returned in sequence to an application program.
Retrieving Data: How to Write Queries Using the SELECT Statement in Programs FETCH listnext INTO :part-no, :part-desc, :price, :qty_available END-EXEC. EXEC SQL CLOSE listnext END-EXEC. A row is returned each time the FETCH statement is executed. This example retrieves all the rows with partnum values greater than the :hvkey value. Note.
Retrieving Data: How to Write Queries Using Null Values The FREE RESOURCES statement is usually more efficient than CLOSE CURSOR unless only a small percentage of defined cursors are active. Note. A FREE RESOURCES statement is required for nonaudited tables to release locks.
Retrieving Data: How to Write Queries Using String Functions Null values are not the same as blanks. Two blanks can be compared and found equal, while the equivalence of two null values is indeterminate. Similarly, null values are not the same as zeros. Zeros can participate in arithmetic operations, while null values are excluded from arithmetic. To determine whether a column accepts null values, you can query the COLUMNS catalog table, or you can invoke a table description in the SQL format.
Retrieving Data: How to Write Queries Extracting Part of a String In the first example, the extracted string starts from the eighth position of the original string, “ROBERT JOHN SMITH”, and extends for four characters. “JOHN” is the result. In the second example, the extracted string starts from the eighth position of the original string and extends until the end. “JOHN SMITH” is the result. The substring in the third example is the whole string. “ROBERT JOHN SMITH” is the result.
Retrieving Data: How to Write Queries Searching for a String Within a String Substring Results That Are Null If the character string, the starting position, or the substring length is a null value, the result is null. Substring Results That Are Empty Strings Sometimes a SUBSTRING function returns a result that is an empty string. An empty string is a string with a length of 0 (“”), which is not the same as a null value.
Retrieving Data: How to Write Queries Searching for a String Without Regard for its Case You can optionally specify which occurrence of the substring you are seeking; for example, you can specify the first occurrence or the third. The data type of the occurrence is unsigned numeric with a scale of 0. The result for this search is 5: POSITION ("IS" IN "MISSISSIPPI", 2) If no substring is found, the function returns 0. If you omit occurrence, then the function returns the first occurrence of the substring.
Retrieving Data: How to Write Queries Removing Leading or Trailing Characters From a String For fixed-length CHAR columns, the result is the length of the column. For columns defined as VARCHAR, the result is the length of the string in the column.
Using the Concatenation Operator Retrieving Data: How to Write Queries The next example uses an asterisk as a TRIM character and removes leading asterisks from the value in the ADDRESS column: TRIM (LEADING "*" FROM ADDRESS) This example removes trailing blank characters from the value in the LAST_NAME column: TRIM (TRAILING " " FROM LAST_NAME) The resulting string is always VARCHAR. For example, a CHAR or VARCHAR returns a VARCHAR.
Retrieving Data: How to Write Queries Using Date-Time Columns This example removes the trailing blanks after FIRST_NAME and LAST_NAME and inserts one blank between the names: TRIM (TRAILING " " FROM FIRST_NAME || " " || TRIM (TRAILING " " FROM LAST_NAME) The result is: "ROBERT SMITH" For more information on the TRIM function, see Removing Leading or Trailing Characters From a String on page 1-19.
Accessing Date-Time Values Retrieving Data: How to Write Queries Accessing Date-Time Values For these examples, suppose that the PROJECTS table contains the data shown in Table 1-2. The WAIT_TIME column specifies a number of days. Table 1-2.
Accessing Date-Time Values Retrieving Data: How to Write Queries The query returns this result: (EXPR) -------------1988-03-12:20:30 Subtracting an INTERVAL Value From a DATETIME Value This example subtracts an INTERVAL value with a MONTH value from a DATETIME value: SELECT end_date - INTERVAL "1" MONTH FROM projects WHERE project_name = "955" ; The query returns this result: (EXPR) -------------1990-12-20:12:30 In this case, the YEAR value was decremented by 1 because subtracting a month from January 20
Accessing Date-Time Values Retrieving Data: How to Write Queries Multiplying an INTERVAL Value This expression doubles an INTERVAL value: INTERVAL "2-7" YEAR TO MONTH * 2 The result is 5 years 2 months. For example, suppose that you specify this query: SELECT END_DATE + INTERVAL "2-7" YEAR TO MONTH * 2 FROM PROJECTS WHERE PROJECT_NAME = "922" ; The value of the END_DATE column increases by 5 years and 2 months.
Accessing Date-Time Values Retrieving Data: How to Write Queries The date-time qualifiers on both sides of a comparison operator must have the same precision. If, for example, one of your columns contains a fraction value, you might change the other literal to include the fraction column. Alternatively, you could use the EXTEND function to adjust the range. CONVERTTIMESTAMP Function Suppose that the BDAY2 table contains birth dates in Julian timestamp form.
Accessing Date-Time Values Retrieving Data: How to Write Queries DATEFORMAT Function For an example of the DATEFORMAT function, consider modifying the query used previously for the CURRENT function: SELECT PROJ_ID, DATEFORMAT (START_DATE, USA) FROM PID WHERE START_DATE = CURRENT YEAR TO DAY ; The query returns this result. The date is now in USA format: PROJ_ID ----------5551 (EXPR) -----------1992/01/29 --- 1 row(s) selected.
Accessing Date-Time Values Retrieving Data: How to Write Queries EXTEND Function In this example, the DAY, HOUR, MINUTE, SECOND, and FRACTION fields to the right of MONTH are initialized to 01 (for DAY), 00 (for HOUR, MINUTE, and SECOND) and 000000 (for FRACTION): EXTEND ( DATETIME "1989-11" YEAR TO MONTH, YEAR TO FRACTION ) The function returns this value: 1989-11-01:00:00:00.000000 In the next example, the YEAR field to the left of MONTH is initialized to the current year.
Specifying Date-Time Values in Programs Retrieving Data: How to Write Queries Specifying Date-Time Values in Programs In SELECT, UPDATE, DELETE, and SELECT with INSERT statements, date values can be specified in different ways, depending on whether the date value is a date literal or a parameter or host variable in a program.
Retrieving Data: How to Write Queries Correlated Subqueries You can combine these two queries into a single query that contains a subquery, as follows: SELECT LAST_NAME, FIRST_NAME, SALARY FROM EMPLOYEE WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE) ; The SELECT that appears on the right-hand side of the predicate is the subquery, sometimes called an inner query. The other SELECT, sometimes called the outer SELECT, uses the value or set of values computed by the subquery.
Defining Predicates Retrieving Data: How to Write Queries WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE EMP2) ; Defining Predicates A predicate is a condition that a row must satisfy to be returned to the application.
Comparison Predicate Retrieving Data: How to Write Queries Example 1-8. Sample Tables for Predicate Examples EMPLOYEE Table EMP_ID -----2703 2705 2906 3598 4096 5361 9069 9502 LAST_NAME --------Smith Simpson Nakagawa Nakamura Murakami Smythe Smith Smithson FIRST_NAME ---------James Travis Etsuro Eichiro Kazuo Roger John Richard DEPT_NUM -------7620 7600 6400 6480 6410 7690 7690 6400 MGR_ID -----2705 6554 6554 2906 3598 9069 2705 6554 SALARY -------47500.00 68000.00 72000.00 50000.00 36000.00 42650.
BETWEEN Predicate Retrieving Data: How to Write Queries Smithson Richard 58300.00 --- 4 row(s) selected. You can also use comparison operators with a subquery that returns a single value (sometimes called a scalar subquery). Suppose that you want to know the name and location of the department that James Smith (employee ID 2703) is in.
LIKE Predicate Retrieving Data: How to Write Queries SQL transforms this kind of predicate into a range predicate, as follows: SELECT LAST_NAME, FIRST_NAME, SALARY FROM EMPLOYEE WHERE SALARY >= 50000 AND SALARY <= 72000 ; The query returns this result: LAST_NAME --------Simpson Nakagawa Nakamura Smithson FIRST_NAME ---------Travis Etsuro Eichiro Richard SALARY -------68000.00 72000.00 50000.00 58300.00 --- 4 row(s) selected.
Retrieving Data: How to Write Queries LIKE Predicate Using LIKE With CHARACTER Columns Columns of data type CHARACTER are fixed length. If the string stored in CHARACTER data columns is shorter than the length of the column, the string is padded with blanks for the length of the column. For example, if you insert “Joe” into a CHAR(6) column, the stored value becomes “Joe ”, which is padded with three blank characters at the end of the string.
Retrieving Data: How to Write Queries Predicates Connected by OR Operators Using LIKE With TRIM You can use LIKE in combination with TRIM to do comparisons. This combination is useful when the host variable is fixed length and the pattern to match is shorter than the length of the host variable.
IN Predicate Retrieving Data: How to Write Queries The query returns this result: EMP_ID -----2705 2906 LAST_NAME --------Simpson Nakagawa FIRST_NAME ---------Travis Etsuro DEPT_NUM -------7600 6400 MGR_ID -----6554 6554 --- 2 row(s) selected. IN Predicate You can use an IN predicate to compare the value of an expression with one or more values of another expression.
EXISTS Predicate Retrieving Data: How to Write Queries SELECT DEPT_NUM, LAST_NAME, FIRST_NAME FROM EMPLOYEE WHERE DEPT_NUM IN (SELECT DEPT_NUM FROM DEPT WHERE DEPT_LOC BETWEEN 900 AND 999) ; The query returns this result: DEPT_NUM -------6400 6480 6410 6400 LAST_NAME --------Nakagawa Nakamura Murakami Smithson FIRST_NAME ---------Etsuro Eichiro Kazuo Richard --- 4 row(s) selected. The list provided by the subquery is a result of the location of the department.
Quantified Predicates Retrieving Data: How to Write Queries EMP_ID -----2705 2906 3598 9069 LAST_NAME --------Simpson Nakagawa Nakamura Smith FIRST_NAME ---------Travis Etsuro Eichiro John --- 4 row(s) selected. Quantified Predicates A quantified predicate always involves a subquery. You can use a quantified predicate to compare an expression that applies to the outer query with all, any, or some of the values returned by a subquery predicate.
IS NULL Predicate Retrieving Data: How to Write Queries SOME and ANY are synonyms. In this example, a SOME query is formulated to select information about employees who are also managers. Note that this example is logically equivalent (that is, it retrieves the same data) to the EXISTS example previously discussed.
Multivalued Comparison Predicate Retrieving Data: How to Write Queries Table 1-4. Evaluation of Expressions That Contain Null Values Expression Type Condition Result Boolean (AND, OR, NOT) Either value is null or both values are null. Null Arithmetic Either value is null or both values are null. Null Aggregate (except COUNT) Expression is evaluated after eliminating nulls. Null if set is empty COUNT DISTINCT Expression is evaluated after eliminating nulls.
Using Multivalued Comparison Predicates in Context-Free Servers Retrieving Data: How to Write Queries Note that there are two conditions for the name Smith, so that the query retrieves information about other employees with the last name of Smith as well as employees whose last name follows Smith in the alphabet.
Aggregate Functions in Predicates Retrieving Data: How to Write Queries Suppose that you want to find the sum of all salaries in the EMPLOYEE table. You can specify this query: SELECT SUM(SALARY) FROM EMPLOYEE ; The query returns this result: (EXPR) ----------413210.00 --- 1 row(s) selected. The query returns the sum of salaries of all employees in EMPLOYEE. Now suppose that you want to find the name of the employee who makes the maximum salary.
Retrieving Data: How to Write Queries Using CASE Expressions Using CASE Expressions A CASE expression evaluates a set of conditions and returns a result that depends on which condition is true.
Evaluating Multiple Conditions Retrieving Data: How to Write Queries This is the result: LAST_NAME ------------CHENG GONZALES LEBLANC PETSKI FIRST_NAME ------------TINA LINDA PIERRE STEVE (EXPR) -------MARRIED SINGLE DIVORCED SINGLE --- 4 row(s) selected. Although the marital statuses of the employees were encoded as 1, 2, and 3, the result contains marital statuses that are decoded.
Computing Aggregates Based on Specific Conditions Retrieving Data: How to Write Queries SELECT LAST_NAME, FIRST_NAME, DEPTNUM, CASE WHEN DEPTNUM = "9000" THEN SALARY * 1.10 WHEN DEPTNUM = "1000" THEN SALARY * 1.12 ELSE SALARY END FROM EMPLOYEE; LAST_NAME ------------CHENG GONZALES LEBLANC PETSKI FIRST_NAME ------------TINA LINDA PIERRE STEVE DEPTNUM ------1000 9000 9000 3500 (EXPR) ---------72800.00 83050.00 40700.00 50000.00 --- 4 row(s) selected.
Computing Aggregates Based on Specific Conditions Retrieving Data: How to Write Queries Using CASE, you aggregate the necessary information with a single scan of the EMPLOYEE table: SET LIST_COUNT 0 ; SELECT DEPTNUM, SUM (CASE WHEN SALARY < 20000 THEN 1 ELSE 0 END), SUM (CASE WHEN SALARY < 50000 THEN 1 ELSE 0 END), SUM (CASE WHEN SALARY < 200000 THEN 1 ELSE 0 END) FROM PERSNL.EMPLOYEE GROUP BY DEPTNUM ; DETAIL DEPTNUM, COL 2 AS I12 HEADING "SAL. < 20000", COL 3 AS I12 HEADING "SAL.
Finding the Highest Value in a Row Retrieving Data: How to Write Queries These are the values in the table: NAME -------- AGE ------ DEPT ----------- CARS ------ BROWN CHANG GONZALES HO 38 KAPOOR LEBLANC PETSKI YAMASAKI 30 38 22 50 50 50 1 2 1 50 50 40 40 1 0 4 2 50 28 25 23 24 2 --- 8 row(s) selected. You need to compute the number of employees who have one car, two cars, and so on.
Finding the Highest Value in a Row Retrieving Data: How to Write Queries The values in the table are as follows: NAME -------BROWN BYSTROM CHUNG GOMEZ HO MCLAIN MINSKY PONG SCHMIDT SMITH SAT1 ----------- SAT2 ----------- 480 510 725 780 715 600 400 790 580 550 520 715 650 610 680 520 510 720 590 630 --- 10 row(s) selected. You want to list the name and the highest of the two scores for each student.
Retrieving Data: How to Write Queries Converting Long, Narrow Tables Into Short, Wide Ones Converting Long, Narrow Tables Into Short, Wide Ones You might find that you can manipulate data easily when you use long, narrow tables, but for reports, you may prefer short, wide tables. Suppose that you have a table that contains each salesperson’s name, a month number, and a bonus amount for that month. No amounts are null and all are numeric. The primary key is NAME and MONTH.
Converting Long, Narrow Tables Into Short, Wide Ones Retrieving Data: How to Write Queries To produce a report with a row for each salesperson’s name and corresponding bonus amounts for months 1 through 12, you can use CASE with GROUP BY: set list_count 0 ; select name, SUM(CASE when month = 1 then amount SUM(CASE when month = 2 then amount SUM(CASE when month = 3 then amount SUM(CASE when month = 4 then amount SUM(CASE when month = 5 then amount SUM(CASE when month = 6 then amount SUM(CASE when month = 7
Retrieving Data: How to Write Queries Ignoring the Largest and Smallest Values in a Set Ignoring the Largest and Smallest Values in a Set You can use CASE with SUM to retrieve the values in the table used in the previous example and ignore the largest and smallest values in the table: set list_count 0 ; select x.value from data x, data y group by x.value having SUM (CASE when y.value <= x.value then 1 else 0 END) > 1 AND SUM (CASE when y.value >= x.
Retrieving Data: How to Write Queries Combining Data From More Than One Table from each table. The join predicate determines whether or not the columns satisfy a given search condition. Defining Predicates on page 1-30 describes how to use predicates to narrow the range of searching. The same principles apply to join queries. If columns satisfy the condition, the join operation selects the desired columns, concatenates the rows, and returns them to the result table.
Types of Join Queries Retrieving Data: How to Write Queries Figure 1-3. Selecting From Two Tables S ELE CT FIRS T_ NAME, LAS T_NA ME, DEP TNA ME FROM E MPLOY EE , DEP T WHE RE EMP LO YE E. EMP NUM = DEP T.MA NAG ER ORDE R BY DE PT.
Types of Join Queries Retrieving Data: How to Write Queries Do not confuse inner and left join operations with join strategies, such as sort merge, key-sequenced merge, nested, and hash. Join strategies refer to the mechanisms SQL uses to perform joins. These strategies are discussed in Section 3, Improving Query Performance Through Query Design. Note. Certain combinations of left joins, inner joins, and the UNION operator are not valid in a single SELECT statement.
Retrieving Data: How to Write Queries Types of Join Queries Both queries produce the same result. The result contains only those rows that satisfy the join predicate given in the WHERE clause in the first query and the ON clause in the second query, as follows: EMP_NUM ------2705 2906 3598 EMP_NAME ORD_NUM -------------HENNESSY, A. 57 NAKAGAWA, E. 77 CHU, F. 12 --- 3 row(s) selected.
Retrieving Data: How to Write Queries Types of Join Queries Notice also that the ON clause specifies the join conditions. In a left outer join query, the WHERE clause applies restrictions on the result of the join operation. To list employees who do not have departments, you could check for the occurrence of a null value in the DEPT_NUM column of the DEPT table as follows: SELECT S.EMP_NAME FROM SALESEMP S LEFT JOIN DEPT DT ON S.DEPT_NUM = DT.DEPT_NUM WHERE DT.
Types of Join Queries Retrieving Data: How to Write Queries Using a Left Join to Show Hierarchical Relationships You can use the left join operation to display hierarchical relationships among data. These examples use the sample database shown in Example 1-10. Example 1-10.
Types of Join Queries Retrieving Data: How to Write Queries The next query specifies a left join operation. All data is preserved from the DEPT table: SELECT DT.DEPTNAME, E.EMP_LNAME, E.EMP_FNAME, DD.DEP_FNAME FROM DEPT DT LEFT JOIN EMPLOYEE E ON DT.DEPTNO = E.DEPTNO LEFT JOIN DEPD DD ON E.EMPNO = DD.
Restrictions on Join Queries Retrieving Data: How to Write Queries In this case, the left join operation displays this hierarchical relationship: Employee Department Dependent 011 Restrictions on Join Queries There are a few restrictions on the use of the join operation, as follows: • • A shorthand view based on any join cannot be specified as the right table of a left join. A shorthand view whose definition is based on a union of SELECT commands cannot participate in another join operation.
Retrieving Data: How to Write Queries The ON Clause and the WHERE Clause in Join Queries SQL first performs an inner join of the REGION table with the SALESEMP table. The result table of this inner join operation is then left joined with the ORDERS table. Because SALESEMP appears on the left of the keywords LEFT JOIN, SALESEMP is the table that is preserved from the left join operation. The query returns this result: EMP_NUM ------2703 2705 EMP_NAME ORD_NUM -------------MORRISON, J. ? HENNESSY, A.
Retrieving Data: How to Write Queries Using the UNION Operator For example, consider this query: SELECT E.LAST_NAME, E.FIRST_NAME, E.DEPTNUM FROM EMPLOYEE E LEFT JOIN DEPT DT ON E.DEPTNO = DT.DEPTNO WHERE E.LAST_NAME = "SPENCER" AND DT.MANAGER = 23; Table E is preserved, table DT is not. The predicates, therefore, can be evaluated as follows: • • The predicate E.LAST_NAME = SPENCER can be evaluated anytime after SQL has examined a row from E. The predicate DT.
Using the UNION Operator Retrieving Data: How to Write Queries Figure 1-4 shows how a union of two selects, one on LOC01 and one on LOC02, could be useful for determining all record titles available at two different store locations. Figure 1-4.
Retrieving Data: How to Write Queries ORDER BY Clause With UNION Operator To specify a UNION operation, these rules apply: • • Both select lists must specify the same number of columns. Columns in corresponding positions must have compatible data types. (For information on compatible data types, see the SQL/MP Reference Manual.
Retrieving Data: How to Write Queries ° GROUP BY and HAVING Clauses With UNION Operator An integer that indicates the ordinal position of a column, a function, or an expression in the select list This statement shows incorrect use of the ORDER BY clause and UNION operator: SELECT A.PROGRAMNAME FROM $VOL1.PROGCAT.PROGRAMS A ORDER BY A.PROGRAMNAME UNION SELECT B.PROGRAMNAME FROM $VOL2.PROGCAT.PROGRAMS B ; Instead, you should formulate the preceding query as: SELECT FROM UNION SELECT FROM ORDER BY A.
Retrieving Data: How to Write Queries Developing Interactive Multistep Queries Developing Interactive Multistep Queries This subsection describes techniques for developing multistep queries to select data for reports. You can use multistep queries as follows: • • To apply aggregate functions to multiple levels of groups To compute what percent the current row value is of all rows These techniques use temporary tables to select data based on more than one query.
Computing Row Value as a Percent of All Row Values Retrieving Data: How to Write Queries >> CREATE TABLE JOBAVG ( +> DEPTNUM NUMERIC (4) UNSIGNED +> JOBCODE NUMERIC (4) UNSIGNED +> AVGSAL NUMERIC (6) UNSIGNED +> CATALOG TEMPTABS ; NO DEFAULT, NO DEFAULT, NO DEFAULT ) 4. Insert the department number, job code, and average salary for each job in each department in the JOBAVG table: >> INSERT INTO JOBAVG +> (SELECT DEPTNUM, JOBCODE, AVG(SALARY) +> FROM PERSNL.EMPLOYEE +> GROUP BY DEPTNUM, JOBCODE) ; 5.
Computing Row Value as a Percent of All Row Values Retrieving Data: How to Write Queries 1. Create a temporary table to contain the average salary for each department: >> CREATE TABLE TEMPTABS.AVGTEMP ( +> DEPTNUM NUMERIC (4) UNSIGNED +> AVGSAL NUMERIC (6) UNSIGNED +> CATALOG TEMPTABS ; NOT NULL, NOT NULL) 2. Insert the department number and average salary into the temporary table: >> INSERT INTO TEMPTABS.AVGTEMP +> (SELECT DEPTNUM, AVG(SALARY) FROM PERSNL.EMPLOYEE +> GROUP BY DEPTNUM) ; 3.
Retrieving Data: How to Write Queries Computing Row Value as a Percent of All Row Values HP NonStop SQL/MP Query Guide —524488-003 1- 68
2 The Optimizer The NonStop SQL/MP optimizer is a component of the SQL compiler. The optimizer plays an important role in the high-performance operation of SQL by selecting the most efficient access plan for a query. During compilation, the optimizer examines each data manipulation statement and generates query execution plans (also called access plans) to retrieve the requested data from the database.
SQL Components and the Optimizer The Optimizer Figure 2-1. SQL Components That Execute a Query SQLCI Process SQLCI2 Process SQL Executor SQL File System Disk Process Disk Process $V1.SVOL.TABLE1 $V2.SVOL.TABLE2 Disk Process $V3.SVOL.TABLE3 Disk Process $V4.SVOL.TABLE4 VST0201.vsd 014 For embedded SQL, the model in Figure 2-1 would not include the SQLCI process, and the SQLCI2 process would be replaced by the user process. In Figure 2-1, each disk process communicates with a different table.
SQL Executor The Optimizer SQL Executor The executor is a set of system library procedures that executes compiled DML statements against database tables, views, or the database catalogs. To execute a DML statement, the executor uses the query execution plan generated and optimized by the SQL compiler.
How the Optimizer Chooses an Execution Plan The Optimizer • • • • • • • • Execution step at which a predicate or subquery should be evaluated The sequence (join order) in which tables will be scanned The join strategy—hash, sort merge, key-sequenced merge, or nested Optimal sort strategy, by the following: ° Examining the sort keys for any ORDER BY, GROUP BY, or DISTINCT requests, and combining the sorts whenever possible ° Eliminating sorts for ORDER BY, GROUP BY, or DISTINCT requests if the chosen
The Optimizer Processor Assignment by the SQL/MP Optimizer and Executor for Executor Server Processes (ESPs) Processor Assignment by the SQL/MP Optimizer and Executor for Executor Server Processes (ESPs) Apply these rules: 1. The SQL/MP optimizer tries to assign the ESP the same processor as that assigned for the primary disk process of the table partition that the ESP will access. 2.
The Optimizer Processor Assignment by the SQL/MP Optimizer and Executor for Executor Server Processes (ESPs) HP NonStop SQL/MP Query Guide —524488-003 2 -6
3 Improving Query Performance Through Query Design You can formulate the same NonStop SQL/MP query in a number of different ways. Some formulations perform better than others because SQL requires less work to return the same result. This section describes how to write queries so that they capitalize on SQL performance features. The guidelines in this section focus on individual SQL statements.
Improving Query Performance Through Query Design Selecting Columns for Faster Data Access For information about database management options and directives that can influence query performance, see Section 4, Improving Query Performance With Environmental Options. Several management options can be used interactively from SQLCI as well as from a program. Note. This manual supports NonStop SQL/MP D30.02 and D30.03.
Improving Query Performance Through Query Design Preparing Queries The query returns this result: EMPNUM FIRST_NAME ------ ------------1 ROGER . . 568 JESSICA LAST_NAME DEPTNUM JOBCODE SALARY ------------- ------- ------- ---------GREEN 9000 100 175500.00 . . . . . . CRINER 3500 300 39500.00 --- 57 row(s) selected. To improve the performance of the query, you can refine the query by specifying only the columns you really need, reducing the amount of data SQL must return.
Improving Query Performance Through Query Design How the Optimizer Processes Predicates For more information about EXPLAIN, see Section 6, Analyzing Query Performance. How the Optimizer Processes Predicates A predicate is an expression that makes an assertion about data. This subsection describes how predicates are classified by the optimizer, how they can be transformed by SQL prior to evaluation, and how SQL evaluates predicates.
Improving Query Performance Through Query Design Transformation of Predicates Some WHERE clauses cannot be processed by MDAM as single predicate sets. Usually these clauses contain one or more ORs and must be processed as multiple predicate sets.
Improving Query Performance Through Query Design Transformation of Predicates WHERE UNIQUE2 = 4 OR (UNIQUE2 = 5 AND FOUR = 2); Transformation of LIKE Predicates A LIKE predicate searches for rows that match a pattern. When using LIKE against a positioning column of an index, if the match is on a leftmost matching string (a literal beginning with anything other than the pattern match symbol, % or _), SQL transforms the statement into an equivalent range predicate.
Improving Query Performance Through Query Design Transformation of Predicates Transformation of BETWEEN Predicates SQL transforms a BETWEEN predicate into the equivalent range predicate; for example: X BETWEEN Y AND Z is transformed to X >= Y AND X <= Z Transformation of Predicates With the NOT Operator A predicate with one or more NOT operators is transformed to simplify and reduce the number of NOT operations; these series of transformations illustrates the process: NOT ((a > b) OR (x < y)) becomes: NO
Improving Query Performance Through Query Design Transformation of Predicates With MDAM, only rows that match the values in the value list are read. Unlike OR optimization, when a predicate has more than one element in the value list, MDAM eliminates any duplicate values at run time, not at compile time. Because this is done before any tables are accessed, there is no performance penalty. MDAM processing appears as an MDAM predicate set in the EXPLAIN plan.
Improving Query Performance Through Query Design • Transformation of Predicates Queries written by users who are not NonStop SQL/MP experts Optimization time can increase when these transformations take place. Although the transformations are automatic, you can use a DEFINE to control compilation time and still receive optimization benefits. For information on the DEFINE, see Controlling the Expansion of Predicates on page 3-20.
Improving Query Performance Through Query Design Transformation of Predicates In this query, the left join cannot be converted to an inner join: SELECT * FROM V WHERE C = 1 OR D = 2 ; The search condition C = 1 OR D = 2 selects rows in which C = 1, regardless of the value of D. Suppose that the left-join operation generates a null-augmented row in which C = 1 and D is null. A logical OR is true if either of its operands is true.
Improving Query Performance Through Query Design Transformation of Predicates Expanding Equality Predicates You can code the same query in several different ways. Each way generates the same result, but if Query Rewrite did not exist, the queries could differ widely in performance, as in these examples: SELECT * FROM T, U WHERE T.C = :hv AND U.D = :hv ; SELECT * FROM T, U WHERE T.C = U.D AND T.C = :hv ; SELECT * FROM T, U WHERE T.C = U.D AND U.D = :hv ; SELECT * FROM T, U WHERE T.C = U.D AND T.
Improving Query Performance Through Query Design Evaluation of Predicates Simplifying Predicates Sometimes the optimizer can determine the results of predicates at SQL compilation time and then simplify or eliminate the predicates. When the optimizer compares values known to be equal, it substitutes a NOT NULL predicate if both these conditions occur: • • The expression can be null. The operator is one of the following: <=, =, or >=.
Improving Query Performance Through Query Design Evaluation of Predicates If a query execution plan changes (because of new statistics or a new index, for example) the category of a predicate might change. These paragraphs describe the four evaluation categories of predicates. Key Predicates A key predicate is a begin key or an end key that defines a lower or upper bound on key columns for sequential retrieval.
Improving Query Performance Through Query Design • Evaluation of Predicates If real sequential block buffering (RSBB) is chosen, the file system evaluates index predicates. For more information about RSBB, VSBB, and other buffering options, see Section 4, Improving Query Performance With Environmental Options. To review the index predicates chosen for a scan, see the EXPLAIN listing for the query. Index predicates are noted with the title “Index pred.
Improving Query Performance Through Query Design Writing Efficient Predicates can reduce the amount of data to be sorted or hashed for sort merge joins, hash joins, or aggregate functions. Executor Predicates An executor predicate is a predicate that must be evaluated by the SQL executor instead of by the disk process or file system. Executor predicates are the least efficient type of predicate because they reject rows only after the rows have already been handled by the disk process and the file system.
Improving Query Performance Through Query Design Positioning With Key Predicates Positioning With Key Predicates Key predicates can greatly reduce the resources required for a query. To specify a key predicate, use a WHERE clause that restricts the search based on the primary key or an index.
Improving Query Performance Through Query Design Positioning With Key Predicates Examples of Key Predicates This query specifies a begin key and uses a host variable. If an index exists on the column LAST_NAME and if SQL uses the index to perform the search, then this query performs better than if SQL sequentially reads every row in the table, starting with the first row.
Improving Query Performance Through Query Design Specifying Join Predicates Using Range Predicates for Positioning A range predicate on a column of the key can be used for key positioning, but subsequent key columns cannot be used for key positioning and instead are evaluated as index or base table predicates.
Improving Query Performance Through Query Design Specifying Multivalued Predicates Specifying Multivalued Predicates You can influence the selection of an access path—and eliminate extra scanning—by specifying more than one column or value on each side of a predicate. Such a predicate, called a multivalued predicate, compares multiple columns with corresponding values. This strategy can be especially useful for key prefixes.
Improving Query Performance Through Query Design Controlling the Expansion of Predicates If the application does not permit a uniform ordering of key columns, an alternative is to truncate the multivalue predicate so that it excludes the nonconforming column. Controlling the Expansion of Predicates If the optimizer determines additional equality predicates are useful, then it considers each equivalence class separately for expansion. You can affect the expansion by using a DEFINE for =_SQL_CMP_EQ_LIMIT.
Improving Query Performance Through Query Design Controlling the Expansion of Predicates Table 3-2. Rules for Expansion of Useful Equality Predicates Condition Expansion Example The equivalence class contains a constant expression. Equality predicates are generated between all pairs of members that are single columns of different tables so that a hash join, sort merge join, or keysequenced merge join can occur.
Improving Query Performance Through Query Design Using OR Operators in Predicates Using OR Operators in Predicates SQL uses a feature called OR optimization for some queries with OR operations. OR optimization uses more than one access path to obtain the data and eliminates duplicate predicates at compile time to produce the result. MDAM is another feature that works with OR operators.
Improving Query Performance Through Query Design • • Using OR Operators in Predicates At least one single predicate—or set of predicates connected by AND operators— contains an executor predicate. A group of predicates connected by AND operators has a high selectivity, which is likely to occur when the operator is not an equal operator (=); instead, SQL might choose to read the table sequentially to search for rows that satisfy the query.
Improving Query Performance Through Query Design Using LIKE Predicates Using LIKE Predicates LIKE constructs can cause full table scans and can therefore result in inefficient queries. Avoid using the LIKE predicate when another operator might be more efficient. Instead, use the equal operator (=) whenever possible.
Improving Query Performance Through Query Design Nested Join query. In this discussion, an outer table is a table that is examined before another table. An inner table is a table examined after the outer table. Two tables can be joined even if there are no joining predicates. In this case, SQL creates the new table by concatenating every row in one table with every row in the other table, using a nested join strategy.
Improving Query Performance Through Query Design Sort Merge Join predicates, it chooses the column with the lowest selectivity. That is, it picks the column that it expects to have the fewest matches between values in the inner and outer table rows. If the outer table is not already sorted on the joining column, the table is sorted into a temporary entry-sequenced table. The join is done between the temporary sorted inner table and the outer table.
Improving Query Performance Through Query Design Key-Sequenced Merge Join Figure 3-2. Sort Merge Join Only if outer table is not in sorted order Outer table: One sequential pass Inner table: One sequential pass Sort Sort Temporary file Temporary file Sort-merge join result 028 VST0302.vsd For information on forcing a merge join, see the SQL/MP Reference Manual. Also, see Specifying a Join Method on page 3-43.
Improving Query Performance Through Query Design Key-Sequenced Merge Join The read process for the key-sequenced merge join is the same as that for the sort merge join, except for the third step. A row is retrieved from the outer table, another row is retrieved from the inner table, and the values of the join columns for the two rows are compared as follows: 1.
Improving Query Performance Through Query Design Hash Join Figure 3-3. Key-Sequenced Merge Join Outer table: One sequential pass Inner table: S equential acc ess Key-sequenced merge join result 0 29 VST0303.vsd The key-sequenced merge join method differs from a nested join in the way that records are retrieved from both the outer and inner tables. The key-sequenced merge join method may read several rows on the outer table before retrieving a row from the inner table.
Improving Query Performance Through Query Design Hash Join A hash join uses a hashing function, rather than indexes or sequential reads, to access specific rows in a file. In general, the process has two phases: 1. The build phase reads the inner table of the join into virtual memory and builds an in-memory hash table for it, using a hashing function based on join attributes. The hashing function calculates indexes into an array of values in main memory.
Improving Query Performance Through Query Design Hash Join Simple Sequential Hash Join To execute a simple sequential hash join, the executor performs two steps. 1. The executor makes a single pass over the inner table of the join, applying selection and projection operations (if possible) to eliminate unnecessary rows and columns of the inner table prior to the actual join operation.
Improving Query Performance Through Query Design Hash Join Parallel Hash Join The optimizer considers a parallel hash-join strategy if CONTROL EXECUTOR PARALLEL EXECUTION ON is specified for the query. SQL supports three types of parallel hash joins: • • • Plain parallel Repartitioned parallel hash Matching partitions Plain Parallel SQL can execute hash joins across processors and across disk partitions. A parallel hash join uses existing partitions and follows these steps: 1.
Improving Query Performance Through Query Design Hash Join Repartitioned Parallel Hash Join A repartitioned parallel hash join reads from existing partitions and repartitions the data across all processors, using these steps: 1. The executor starts an ESP for each partition of the inner table. 2. The executor starts an ESP for each partition of the outer table. (The outer table must have more than one partition.) 3.
Improving Query Performance Through Query Design Determining a Join Strategy Determining a Join Strategy When evaluating join methods, the optimizer looks at ways to join tables, both with structure and ordering of combinations of tables and with various join strategies. The goal is to minimize processor time, disk access, sorts, and other resource use so that resource consumption is minimized, and performance is as fast as possible. The optimizer evaluates performance for each access path to each table.
Improving Query Performance Through Query Design Determining a Join Strategy DEPT_NUM column. The DEPT table has DEPT_NUM as the primary key. this query asks for employee and department information: SELECT EMP_NAME, DEPT_NAME, SALARY FROM EMPLOYEE, DEPT WHERE EMPLOYEE.DEPT_NUM = DEPT.DEPT_NUM ; The information is retrieved by joining the EMPLOYEE and DEPT tables.
Improving Query Performance Through Query Design Determining a Join Strategy In some situations, SQL forms a single Cartesian product that significantly reduces the number of rows to be joined. Consider this example that creates three tables and then selects data from the tables: CREATE TABLE buildng (a INT, m INT, n INT, KEY a) ; CREATE TABLE room (x INT, b INT, KEY x) ; CREATE TABLE locatn (c INT, d INT, e INT, f INT, KEY (c,d)) ; SELECT * from buildng, room, locatn WHERE buildng.a, room.b = locatn.
Improving Query Performance Through Query Design Determining a Join Strategy For each join performed within the join strategy listed in the “Join Strategy” column, the cost increases by the algorithm shown in the “Order of Cost” column. Various factors influence these general rules. Table 3-4 compares join strategies. Table 3-4.
Improving Query Performance Through Query Design Determining a Join Strategy Table 3-4. Comparison of Join Strategies (page 2 of 2) Element of Comparison Nested Join Sort Merge Join Key-Sequenced Merge Join Hash requirements None None None All hash joins hash the inner table. Hybrid hash joins also hash the outer table. Use of MDAM Can be used on inner and outer tables. Can be used on outer table. Can be used on outer table. Can be used on outer table.
Improving Query Performance Through Query Design Writing Efficient Joins In general, the optimizer chooses a hash join in preference to a sort merge join for situations where an equality predicate exists for the join operation. The performance advantage of the hash join increases with increasing size difference between input tables. This advantage decreases when data is not distributed uniformly. The choice between the hash join and the key-sequenced merge join is not so simple.
Improving Query Performance Through Query Design Using Indexes Using Indexes The use of an index improves join performance by eliminating sort operations. In situations where the fastest possible response time is required, do not specify joins where no index exists. If you are not sure which columns are keys or indexes, check with your database administrator. Eliminating Implicit Joins When tables are joined, each new row is formed by concatenating two rows, one from each of the original tables.
Improving Query Performance Through Query Design Adding Join Predicates 100210 300350 600480 800660 --- 4 row(s) selected. To eliminate the unnecessary join and include a join predicate, therefore eliminating the Cartesian result, rewrite the query as follows: SELECT ORDERNUM FROM ODETAIL O WHERE O.PARTNUM = 5100 AND QTY_ORDERED < (SELECT AVG(QTY_AVAILABLE) FROM PARTS P WHERE P.PARTNUM = 5100) ; For a cost analysis of both formulations of the query, see Section 5, Selectivity and Cost Estimates.
Improving Query Performance Through Query Design Using Joins Instead of Subqueries These examples show how to reformulate subqueries into join queries. Consider this query, which contains a subquery: SELECT employee.name FROM employee WHERE employee.dept_no IN (SELECT dept.dept_no FROM dept WHERE dept.name = "development") ORDER BY employee.name ; You can change this query into a join query as follows: SELECT employee.name FROM employee, dept WHERE employee.dept_no = dept.dept_no AND dept.
Improving Query Performance Through Query Design Specifying a Join Method Specifying a Join Method Two directives control the selection of join method: • • The CONTROL QUERY HASH JOIN option specifies whether SQL can use hash joins if the optimizer expects hash joins to improve query performance. The CONTROL TABLE JOIN METHOD option specifies the join method SQL uses when the specified table is the inner table of a join operation. Options include NESTED, MERGE, KEY SEQUENCED MERGE, and HASH.
Improving Query Performance Through Query Design Specifying a Join Method Join method does not affect the first table in a join sequence (the table that ends up as Step 1 of the query plan). Step 1 is always a scan operation; the remaining steps are join operations. (For more information see, Specifying a Join Sequence on page 3-45.) Caution.
Improving Query Performance Through Query Design Specifying a Join Sequence Specifying a Join Sequence The CONTROL TABLE JOIN SEQUENCE option controls the order in which tables are combined in a join operation. You specify access by defining an ordinal position for a table. The statement accepts integers; the table associated with the number one is the first table processed. For example, JOIN SEQUENCE 1 forces SQL to assign the specified table to the first step of the join operation (the outermost loop).
Improving Query Performance Through Query Design How the Optimizer Processes Aggregates and Group-By Operations Confirm all use of this option with data from the Measure product and verify its use periodically to account for changes in data distributions and volumes. Reevaluate its effectiveness with each new version of SQL.
Improving Query Performance Through Query Design Evaluation by the Disk Process If the begin-key and end-key predicates contain only equal comparisons (=), then the MIN or MAX processing of the next key column in sequence, which is not part of the begin or end key, is also evaluated by reading only one row from the index.
Improving Query Performance Through Query Design Evaluation by the Disk Process If there is no GROUP BY clause, disk process aggregation is selected if these conditions are true: • The query does not contain any executor predicates (except HAVING); for example: SELECT SUM(SALARY) FROM EMPDATA WHERE SALARY >= 50000; • • The query uses primary index access or index-only access; that is, all columns referenced by the query can be found in the index.
Improving Query Performance Through Query Design • • Evaluation by the Disk Process The optimizer determines that there is a saving in messages when using disk process aggregation and grouping. This determination is based on costing. The query does not use OR optimization. GROUP BY Using a Parallel Plan In a parallel plan, grouping and aggregation are done by the disk process if this conditions are satisfied: • • The query accesses a single table.
Improving Query Performance Through Query Design Evaluation by the Executor Component MDAM can process GROUP BY predicates in queries that use serial or parallel plans. Evaluation by the Executor Component If an aggregate or GROUP BY operation does not fit the preceding requirements, it cannot be evaluated by the disk process. All participating rows are transferred from the disk process to the executor, where groups are formed and aggregation is done.
Improving Query Performance Through Query Design Sorted GROUP BY Operation For parallel plans, SQL always uses hashed aggregation unless a previous version of software is running that does not support hashed aggregations. Each ESP process does its own aggregation and sends its results to the master ESP. This strategy reduces network traffic by applying grouping and aggregation locally. SQL does not use hashing for columns that use collations.
Improving Query Performance Through Query Design EMP_ID -----2705 2906 3598 9069 FIRST_NAME ---------Travis Etsuro Eichiro John LAST_NAME --------Simpson Nakagawa Nakamura Smith Correlated Subquery MGR_ID -----6554 6554 2906 2705 --- 4 row(s) selected. Note. The preceding example associates the correlation names EMP1 and EMP2 to different instances of the EMPLOYEE table. A correlation name is an SQL identifier that you associate with a table or view.
Improving Query Performance Through Query Design Noncorrelated Subquery A subquery is dependent on the outer query if the subquery is correlated to the outer query, as shown: SELECT ITEM_NAME, RETAIL_PRICE FROM INVNTRY OUTER WHERE RETAIL_PRICE > SELECT AVG(RETAIL_PRICE) FROM INVNTRY WHERE PRODUCER = OUTER.PRODUCER This example selects information on items that cost more than the average price of the items produced by the same producer.
Improving Query Performance Through Query Design Avoiding Full Table Scans Avoiding Full Table Scans Scans of an entire table can be quite costly in terms of performance. Response time is directly proportional to the number of rows or blocks processed. In general, you should avoid online transaction processing queries that invoke full table scans unless the table is quite small, consisting of only a few blocks.
Improving Query Performance Through Query Design Sort Operations the index (the first column is 0). You can also retrieve this information from the indexes table in the catalog. • • • • Use predicates for the leading key columns where possible, because SQL uses predicates for key positioning. As the number of leading key columns with predicates increases, the key positioning becomes more precise. When you use an ORDER BY clause with an index, include all leading columns of the index being used.
Improving Query Performance Through Query Design Sort Operations Sort Implementations Depending upon the query request, SQL determines if a sort of the data rows is required and automatically initiates the sort process. A logical sort can be implemented in one of these ways: • An in-memory user process sort (UPS).
Improving Query Performance Through Query Design Optimizing Combinations of Clauses You can use the EXPLAIN utility to check the type of sort operation performed and the size of the workspace allocated by the executor (UPS workspace) for in-memory sorts. For more information on EXPLAIN, see Section 6, Analyzing Query Performance. Optimizing Combinations of Clauses The SQL optimizer attempts to minimize the number of sorts required for a query.
Improving Query Performance Through Query Design Using Indexes Specifying GROUP BY With DISTINCT You can specify grouping and the elimination of duplicate rows and still avoid extra sorts. A single sort satisfies both grouping and the elimination of duplicate rows when this occurs: • The GROUP BY list is a subset of the SELECT DISTINCT list, as follows: SELECT DISTINCT COUNT(*), B, B-D, D FROM T GROUP BY B, D ; SQL performs a single sort (on B, D) to perform the grouping.
Improving Query Performance Through Query Design Using Indexes specify complete column order, performance is best, but when you cannot, MDAM preserves the order of the key. For information about creating indexes, see the SQL/MP Installation and Management Guide. Examples Consider this query: SELECT * FROM INVNTRY ORDER BY ITEM, RETAIL_PRICE ; If the INVNTRY table is large, the cost of sorting the table might be very high.
Improving Query Performance Through Query Design Writing Efficient Programmatic Statements Writing Efficient Programmatic Statements When writing programmatic queries, you might have a choice between several strategies for combinations of SELECT statements, cursor use, and update, delete, and insert operations. Single-Row and Multiple-Row SELECT Statements A single-row SELECT statement is a request to return a single row to the host program.
Improving Query Performance Through Query Design • • Decision Support Considerations Conversely, when selecting and updating multiple rows, choose multiple updatable cursor UPDATES and DELETES WHERE CURRENT over multiple single-row selects followed by exact updates or deletes. The EXCLUSIVE and REPEATABLE lock mode and access options are recommended for these types of operations. For more information, see Section 4, Improving Query Performance With Environmental Options.
Improving Query Performance Through Query Design • • • • • • • • Online Transaction Processing Considerations Column selectivity. See Combinations of Predicates on page 5-4. Query Rewrite. See Transformations Related to Joins on page 3-8. Reduction of sorts. See Sort Operations on page 3-55. In-memory sorts. See Sort Operations on page 3-55. Hashed groupings. See Hashed Aggregation and Grouping on page 3-50. CASE expressions. See Using String Functions on page 1-15. String functions.
Improving Query Performance Through Query Design Batch Considerations Batch Considerations Batch operations imply queries that process large amounts of data in a sequential order. Use of these can improve batch performance: • • • • • Block buffering, described in Reducing Messages With Buffering Options on page 4-21. Parallel sorts to increase speed and balance processor and disk use. Parallel processes to even out workload and make the system easier to balance.
Improving Query Performance Through Query Design HP NonStop SQL/MP Query Guide —524488-003 3- 64 Batch Considerations
4 Improving Query Performance With Environmental Options The best way to control your NonStop SQL/MP processing environment is to design and maintain your database so that the mix of queries executes efficiently.
Improving Query Performance With Environmental Options Keeping Statistics Current Keeping Statistics Current SQL provides an UPDATE STATISTICS utility to collect and save statistics on columns and tables. The SQL compiler uses these statistics to determine the selectivity of predicates, indexes, and tables. Because selectivity directly influences the cost of access plans, it is important to have current statistics for a table, to increase the likelihood that the optimizer will choose an efficient plan.
Improving Query Performance With Environmental Options • Keeping Statistics Current Always specify the NO RECOMPILE option when using UPDATE STATISTICS, for this reasons: ° By default, an UPDATE STATISTICS operation invalidates dependent programs, even if UPDATE STATISTICS is executed within a transaction that is backed out. Catalogs are audited; program file labels are not. Because program file labels are not audited, updates to program file labels are not backed out.
Improving Query Performance With Environmental Options Optimizing the Access Path Optimizing the Access Path An access path is the method by which data is accessed. Access can be one of these: • • Primary access (table scan or primary key) Alternate-index access Different access paths provide different degrees of efficiency in accessing a table. Primary Access There are two types of primary access: table scan and primary key.
Improving Query Performance With Environmental Options Alternate-Index Access Primary Key Access through a primary key means reading a portion of the base table derived from the primary-key value.
Improving Query Performance With Environmental Options • Using the CONTROL QUERY Directive A table scan can be avoided and the percentage of rows to be read is small enough to make index access cheaper. An index contains one or more columns defined as the index, plus the columns that make up the primary key. An index benefits the query most when all the columns needed by the query are located in the index.
Improving Query Performance With Environmental Options Selecting an Access Path When an Index Is Not Available You should specify INTERACTIVE ACCESS OFF when query or batch processing a large number of rows. The default value is OFF. The CONTROL QUERY directive is most valuable in application use when hostvariable values are not available at compilation time.
Improving Query Performance With Environmental Options Understanding Unexpected Access Paths produced because information might be available for other queries in the same program. At run time, the SQL executor recompiles the query when it encounters the flag that indicates a recompile is necessary. The SQL executor instructs the SQL compiler to ignore unavailable information during this compile. If all information is now available, the most efficient access plan can be selected.
Improving Query Performance With Environmental Options Understanding Unexpected Access Paths Index-Only Access Is Not Possible Index-only access is not used if any of these are true: • • • • The columns required by the query are not all included in the index. An exclusive lock mode is selected. OR optimization is done. (For more information, see Writing Efficient Predicates on page 3-15.
Improving Query Performance With Environmental Options Understanding Unexpected Access Paths If you specify this query: SELECT * FROM T WHERE A > 9 AND B > 28 ; The optimizer operates as if these were true: • • 10 percent of the table will be selected through index A. 20 percent of the table will be selected through index B. Based on this assumption, the optimizer will choose index A. However, because there are 1,000 rows with A = 10, index A will actually access 50 percent of the table.
Improving Query Performance With Environmental Options Specifying an Access Path One way to avoid the Halloween problem is to ignore the index on the column being updated (RETAIL_PRICE in the previous example) and choose another index as the access path, but this can result in an inefficient access plan.
Improving Query Performance With Environmental Options Specifying an Access Path If you specify the SYSTEM option, SQL chooses the access path; this option is the default. If, for example, you specified CONTROL TABLE * ACCESS PATH PRIMARY to force primary key access during an SQLCI session, use CONTROL TABLE * ACCESS PATH SYSTEM to restore the default access path selection mechanism. Whenever you force MDAM by using the CONTROL TABLE directive, you must specify an access path.
Improving Query Performance With Environmental Options Requesting Parallel Processing Requesting Parallel Processing SQL can take advantage of multiprocessor architecture by dividing an SQL query into smaller tasks and assigning the tasks to separate processors. During parallel processing, each part or partition of data is processed in parallel. After all partitions are processed, they are merged to produce the final result.
Improving Query Performance With Environmental Options How Parallel Processing Is Implemented The optimizer does not choose a nested parallel join when stable access is specified and the tables are not partitioned exactly the same and on a single column. The CONTROL EXECUTOR PARALLEL EXECUTION ON directive stays in effect until the end of an SQLCI session or until the directive is specified again.
Improving Query Performance With Environmental Options Requesting Parallel Operations on Partitioned Data The master executor assigns one ESP process to each partition that must be accessed. At run time, the master executor starts an ESP process in the current primary processor of each partition’s disk volume (unless an existing ESP process can be used). Each ESP works only on the partition to which it is assigned.
Improving Query Performance With Environmental Options Specifying Access Option and Lock Characteristics Prior to parallel execution of a SELECT statement, some table might not be partitioned or might be partitioned in a way that does not facilitate parallel processing. The optimizer can request that the executor repartition (reorganize) a copy of the data at run time. During repartitioning, SQL distributes the data over a set of temporary partitions.
Improving Query Performance With Environmental Options Access Option Each row locked with repeatable access stays locked until the corresponding transaction is committed. • Stable access Stable access requires that when a row is retrieved, it must not be locked in an exclusive manner by another transaction. Stable access locks all data accessed through the DML statement. This access is useful for applications that want to read only committed data.
Improving Query Performance With Environmental Options Access Option When using BROWSE ACCESS in SQLCI for a query that might run for a long time, disable transaction generation before starting the query. By doing so, you prevent the lengthy execution of an automatically generated transaction. To disable transaction generation for user-defined DDL or DML statements, execute this statement: SET AUTOWORK OFF ; See the SQL/MP Reference Manual for information on this statement.
Improving Query Performance With Environmental Options Lock Mode Lock Mode Lock mode controls access to locked data. The two modes are these: • Exclusive. The lock owner can access and modify the data. Users who have specified browse access can read the data. This is typically specified for multistep updates or deletes, where data is selected and then updated or deleted, to eliminate the need to convert a shared lock (for the select) to an exclusive lock (for the update or delete).
Improving Query Performance With Environmental Options Waiting For Locks LOCK TABLE is an executable SQL statement. If you use LOCK TABLE, the SQL compiler is not aware of the table lock before compiling subsequent statements because each SQL statement is compiled independently. The LOCK TABLE statement might even be in a separate program from subsequent statements. CONTROL TABLE is a compiler directive.
Improving Query Performance With Environmental Options Reducing Messages With Buffering Options Reducing Messages With Buffering Options This subsection describes how SQL retrieves data from the disk process. Retrieval can be by row or by block. The size of a block is specified for a table when the table is created. Types of Buffering The optimizer can choose between these types of access: • • Single row—SQL returns data from the disk process one row at a time.
Improving Query Performance With Environmental Options Single-Row Access The INVNTRY table has these characteristics: • • • • • • The total size of each row is 428 bytes (20 + 4 + 4 + 400). The table contains 100 rows. There are 90 items with a RETAIL_PRICE value greater than 10. There is no alternate index. The block size is 4096 bytes. There is no slack space in the table.
Improving Query Performance With Environmental Options Real Sequential Block Buffering (RSBB) Real Sequential Block Buffering (RSBB) If a query accesses most of the rows in a table and most of the columns in each row, it can be beneficial for the disk process to return a complete block of rows to the file system rather than returning one row at a time. This process is called real sequential block buffering (RSBB).
Improving Query Performance With Environmental Options Virtual Sequential Block Buffering (VSBB) average number of rows per block and average percent slack space per block. Note that it processes the whole table, so use with caution if the file is large. For more information, see the File Utility Program (FUP) Reference Manual. Guidelines for Choosing RSBB RSBB is used when the disk process can do only a minimal amount of filtering (selection and projection).
Improving Query Performance With Environmental Options Virtual Sequential Block Buffering (VSBB) VSBB reduces the number of messages between the file system and the disk process. Further, it reduces the amount of data transferred between the file system and the disk process. Figure 4-4.
Improving Query Performance With Environmental Options Virtual Sequential Block Buffering (VSBB) VSBB is used for UPDATE and DELETE operations if no more than one row is to be updated out of 32 rows examined. VSBB is not used for nonaudited tables if any of these is true: • • • The syncdepth value is greater than 0. A table lock is not used. The table contains alternate indexes, and a varying-length column is being updated.
Improving Query Performance With Environmental Options Comparison of Buffering Types into the table, a high percentage of lock waits and timeouts might occur. You can disable VSBB for insert operations by specifying this directive: CONTROL TABLE * SEQUENTIAL INSERT OFF Note. Disabling sequential insert or update operations does not automatically disable sequential read operations. You must specify CONTROL TABLE SEQUENTIAL READ OFF to disable sequential read operations.
Improving Query Performance With Environmental Options Requesting Buffering Requesting Buffering You can use the CONTROL TABLE directive to set buffering to ON, OFF, or ENABLE for sequential insert, read, or update operations. The ENABLE setting lets the system choose whether to use SBB or not for a specific query; this is the default for read operations. This example requests buffering for sequential UPDATE operations: CONTROL TABLE SALES.
Improving Query Performance With Environmental Options Controlling the Opening of Tables, Views, and Indexes Controlling the Opening of Tables, Views, and Indexes The NonStop SQL/MP file system opens (grants access to) objects when directed to do so by the NonStop SQL/MP executor. Tables, views, and indexes, and partitions of these objects, are usually opened on demand. SQLCI users or application programs do not influence when tables, views, and indexes are opened or closed.
Improving Query Performance With Environmental Options Controlling the Number of Key Columns Used by MDAM In addition to the open-on-demand feature, NonStop SQL/MP also ensures that tables are automatically created to provide local autonomy. Local autonomy ensures that you can access local data regardless of the availability of other local dependent objects or remote dependent objects, if the local data that is available can fully satisfy your request.
Improving Query Performance With Environmental Options Controlling the Creation of NonStop SQL/MP Processes If you choose a DENSE algorithm, the executor starts with 1 and increments the value sequentially to obtain the next values for column A. However, if the values for column A are 25, 135, 400, and 525, then DENSE would not be a good algorithm. Too many accesses would be made if each value were sequentially incremented until the next value were found.
Improving Query Performance With Environmental Options Enhancing Sort Performance Enhancing Sort Performance You can enhance the performance of sorts within SQL queries in several ways: • • • • • • Use subsorts, configured by specifying appropriate SUBSORT attributes in =_SORT_DEFAULTS DEFINEs. Direct FastSort to use additional memory when sorting data by setting the VLM attribute ON in a user-specified SORT DEFINE or a =_SORT_DEFAULTS DEFINE.
Improving Query Performance With Environmental Options Minimizing Overhead of Query Programs For information about limits on concurrency, see the SQL/MP Reference Manual. For information about maximizing concurrency during DDL operations, see the SQL/MP Installation and Management Guide.
Improving Query Performance With Environmental Options Minimizing Overhead of Query Programs HP NonStop SQL/MP Query Guide —524488-003 4- 34
5 Selectivity and Cost Estimates NonStop SQL/MP uses selectivity and cost when choosing an execution plan.
Computing Selectivity Selectivity and Cost Estimates • Join order The selectivity of each table helps determine the optimizer’s choice of the outer and inner table, because it helps determine cost. Cost and available access paths are the determining criteria for join order.
Predicate Selectivity Selectivity and Cost Estimates Predicate Selectivity Predicate selectivity is the estimated percentage of rows in a table or an index that satisfy a given predicate. The optimizer uses the selectivity of key predicates to estimate the number of rows to examine; it uses the selectivity of the rest of the predicates to estimate the number of rows that qualify. Those rows that actually satisfy the predicate are selected for processing at the next stage.
Predicate Selectivity Selectivity and Cost Estimates • column >= is treated as NOT column < constant For numeric ranges, the optimizer uses the constant to interpolate between the SECONDLOWVALUE and SECONDHIGHVALUE obtained by UPDATE STATISTICS.
Predicate Selectivity Selectivity and Cost Estimates Selectivity for Range Predicates When a range is used in a predicate, selectivity is calculated as follows: 1. The SECONDLOWVALUE is subtracted from the SECONDHIGHVALUE. The result is the total number of values in the range. For example, if the SECONDLOWVALUE is 0 and the SECONDHIGHVALUE is 1000, the result is 1000 minus 0, or 1000. 2.
Index Selectivity Selectivity and Cost Estimates NULL Values The presence of NULL values in a column is considered when determining selectivity of IS NULL predicates and range predicates and in determining index selectivity for pairs of range predicates. For example, the selectivity of C IS NULL is 0 percent if column C has the NOT NULL attribute If null values are present and the UEC for a column is two or less, SQL sets SECONDHIGHVALUE to the highest nonnull value in the table.
Table Selectivity Selectivity and Cost Estimates If index predicates are present, their selectivity is combined with the selectivity of the begin-key and end-key predicates. You can access this information using the EXPLAIN utility, under the titles “Pred. selectivity” and “Index selectivity.” Consider this query: SELECT ITEM_NAME, RETAIL_PRICE FROM INVNTRY WHERE ITEM_NO = 20 ; Suppose that there is a unique index on ITEM_NO in the INVNTRY table.
Selectivity and Cost Estimates Example Combining Predicate, Index, and Table Selectivity predicates in this order: index predicates, base-table predicates, executor predicates. The combined value is the estimated value for the table selectivity. Table selectivity strongly influences the estimated cost for subsequent steps of the query execution plan. For example, suppose that the estimated selectivity for a table with 1,000 rows is 20 percent. Consequently, it is expected that 200 rows will be retrieved.
Use of Default Selectivity Values Selectivity and Cost Estimates Note that the optimizer combines both EMP_START predicates to obtain a selectivity of .111. Finally, the optimizer uses the selectivity in steps: .0100 * .0100 * .3333 * .3333 .0000111 = index selectivity * .9900 * .3333 .00000367 = table selectivity Use of Default Selectivity Values The optimizer uses default selectivity values when • • • Statistics are not available because an UPDATE STATISTICS has not been performed.
Join Selectivity Selectivity and Cost Estimates Table 5-2. Computed and Default Selectivity Values for Predicates (page 2 of 2) Type of Predicate Computed Value Default Selectivity LIKE N. A 10 percent NOT LIKE N. A 30 percent EXISTS N. A 40 percent NOT EXISTS N.
Assigning Cost to a Query Selectivity and Cost Estimates The number of groups is then the product of the UECs of each grouping column, bounded by the number of rows being grouped. Default selectivity is used when statistics are not available. If the grouping element is an expression, the element’s selectivity is based on the underlying columns in the expression. Assigning Cost to a Query Cost is an estimate of the amount of time the system takes to complete evaluation of a specific query.
Cost of Accessing Tables Selectivity and Cost Estimates Cost of Accessing Tables The cost of using an index to access a table in a query that references only one table is as follows: Cost(index) = Cost(physical I/O) + Cost(record overhead) + Cost(data transfer) + Cost(message) + Cost(sub-query) + Cost(sort) If the cost of a component is less than one physical I/O, the cost of the component is truncated to zero.
Cost of Record Overhead Selectivity and Cost Estimates Cost of Record Overhead The cost of record overhead is the processor time, expressed in terms of physical I/Os, associated with handling rows. This estimate includes the cost of setting up various control blocks and is dependent on the number of rows examined.
Cost of Subqueries Selectivity and Cost Estimates Cost of Subqueries The cost of a subquery is estimated as if it were a query in its own right. The cost equals the cost of the subquery, multiplied by the number of times it must be executed. It is computed as the cost of index’, where index’ is the index chosen to execute the subquery. Cost also depends on whether a subquery is correlated or noncorrelated, as described in Section 1, Retrieving Data: How to Write Queries.
Cost of Join Operations Selectivity and Cost Estimates The cost of a sort depends on the type of sort required. In general, cost items include: • • • • • I/O cost to insert data into entry-sequenced file Scratch file cost Compare cost Temporary file creation Miscellaneous message and setup costs For an in-memory user process sort, only the compare cost is necessary. Sort costs for an ORDER BY are based on the number of rows going into the sort.
The Effects of Indexes and Predicates on Costs Selectivity and Cost Estimates After the cost of accessing each table in the join has been determined, the cost of the join can be determined. For a nested join of two tables, the cost is cost (a join b) = cost (a) + n x cost(b) N is the number of rows that satisfy the nonjoin predicates on table A. (N is the number of times the inner loop must be performed.
The Effect of the MultiDimensional Access Method (MDAM) on Costs Selectivity and Cost Estimates Table 5-3. Costs for Indexes With Predicates (page 2 of 2) Type of Access Approximate Index Cost Predicates Physical I/O Cost Primary key Equality predicates do not specify all key columns.
The Effect of the MultiDimensional Access Method (MDAM) on Costs Selectivity and Cost Estimates BETWEEN 5 AND 10, a positioning takes place for each value of B between 5 and 10. The number of positionings that take place for a predicate set are totaled as part of the cost. A positioning does not take place if a range is for the last-used key column. To estimate the cost of an IN predicate within a predicate set, the optimizer converts it to an IN predicate equivalent.
Evaluating Cost Estimates Selectivity and Cost Estimates Evaluating Cost Estimates When examining cost, these guidelines apply: • High cost indicates that the given query appears to be (and probably is) expensive. Always review high cost statements. Try to estimate how much I/O such a query should take and if it is consistent with that reported by EXPLAIN.
Forcing Execution Plans Selectivity and Cost Estimates • A table or index with a lower estimated cost In general, the optimizer attempts to choose a local table or index that has the least number of qualifying rows that must be examined. Forcing Execution Plans The goal of the optimizer is to generate a plan that works well on the average. Because of variations in applications and data, however, SQL sometimes chooses a plan that is not optimal.
Forcing Execution Plans Selectivity and Cost Estimates • • The query might not be able to benefit from future enhancements to SQL Changes to the database structure (such as dropping an index) can require recompilation when the option is in use Therefore, make any occurrences of it easy to find and change, using one or more of these alternatives: • • • Make sure the directive only applies to the statement and table intended. Return the specified table to SYSTEM method directly after the statement.
Forcing Execution Plans Selectivity and Cost Estimates HP NonStop SQL/MP Query Guide —524488-003 5- 22
6 Analyzing Query Performance Different queries have varying levels of impact on your system. One way to estimate query use is with the 90-10 rule, which estimates that 10 percent of the queries use up 90 percent of critical resources. The 90-10 rule can help you determine which queries are most important from a performance viewpoint. Note, however, that performance evaluation at the statement level should be done on a system that is as well-tuned as possible.
Guidelines for Tuning Queries Analyzing Query Performance • • • • • • EXPLAIN Plan for Cursor DELETE on page 6-73 EXPLAIN Plan for INSERT on page 6-74 EXPLAIN Plan for INSERT-SELECT on page 6-75 EXPLAIN Plan for UPDATE on page 6-77 EXPLAIN Plan With Date-Time Values on page 6-78 Comparing Cost: A Scenario on page 6-80 Guidelines for Tuning Queries When examining and tuning queries, use available tools such as these: • • • The DISPLAY STATISTICS command, which displays run time statistics about the las
Preparing Your Queries Analyzing Query Performance Preparing Your Queries Before you test your queries, you should prepare them using the SQLCI PREPARE command. The PREPARE command compiles an SQL statement and assigns a name to the statement. You can then reference the statement name to execute the statement multiple times without recompiling, and you can obtain an EXPLAIN plan for the compiled statement. These steps outline this procedure: 1.
Simple Query Example Analyzing Query Performance Simple Query Example Example 6-1 shows a simple query that selects all rows and columns from the EMPLOYEE table: Example 6-1. Simple Query 23> SQLCI >> PREPARE Q1 FROM +> SELECT * FROM EMPLOYEE ; --- SQL command prepared. >> EXECUTE Q1 ; The query returns the following result: EMPNUM FIRST_NAME LAST_NAME DEPTNUM JOBCODE SALARY ------ ------------- ------------- ------- ------- ---------1 ROGER GREEN 9000 100 175500.00 . . . . .
Simple Query Example Analyzing Query Performance • • • • ° Elapsed time includes the execution time, I/O time, and the time to display the result. ° Execution time is the amount of processor time used by the executor. The number of records accessed and the number of records used ° Records accessed is the number of rows read (including rows that do not satisfy the selection criteria). ° The rows are counted for each table, underlying table of a protection view, and temporary table.
Simple Query With ORDER BY Example Analyzing Query Performance Simple Query With ORDER BY Example Example 6-3 selects specific columns and orders them by salary: Example 6-3. Simple Query With ORDER BY >>PREPARE Q2 FROM +> SELECT LAST_NAME, FIRST_NAME, SALARY +> FROM EMPLOYEE +> ORDER BY SALARY ; --- SQL command prepared. >> >>EXECUTE Q2 ; LAST_NAME -------------------- FIRST_NAME --------------- DAY CHAPMAN . GREEN KATHRYN SUSAN . ROGER SALARY ----------12000.00 17000.00 . 175500.
Using Measure Analyzing Query Performance Using Measure You can use the Measure product to gather statistics on an SQL database and application programs. This subsection briefly describes the kinds of statistics that Measure provides. For more information on using Measure, see the SQL/MP Installation and Management Guide. The Measure product provides statistics on process execution and on individual statement execution.
Evaluating Measure Data Analyzing Query Performance compilation. An SQL section is generated for each SQL statement and is listed in the compilation output following the program code. The exception is for the statements on cursors: OPEN, FETCH, and CLOSE cursor statements. The counters of the OPEN, FETCH, and CLOSE cursor statements all contribute to the counter of the DECLARE CURSOR section number.
Generating an EXPLAIN Plan Analyzing Query Performance If a query has subqueries, the report shows additional steps in the same way for each subquery. You can use the information in the EXPLAIN plan for these types of tasks: • Assisting application program design; for example, the plan can help you ° ° • • Determine the access path to be chosen. Identify problems causing long response times, such as large sorts, full table scans, and correlated subqueries.
Generating an EXPLAIN Plan Analyzing Query Performance >> PREPARE Q2 FROM +> SELECT LAST_NAME, FIRST_NAME, SALARY +> FROM EMPLOYEE +> ORDER BY SALARY ; --- SQL command prepared. >> EXPLAIN PLAN FOR Q2 ; You can also specify EXPLAIN as part of the statement syntax: >> EXPLAIN +> SELECT LAST_NAME, FIRST_NAME, SALARY +> FROM EMPLOYEE +> ORDER BY SALARY ; A plan provides different types of information for different types of operations.
Generating an EXPLAIN Plan Analyzing Query Performance The total cost represents the cost of doing all the operations to complete the statement. In Example 6-4 on page 6-6, the DISPLAY STATISTICS command shows the estimated cost for this query is 3. By using EXPLAIN, you can see that the sort cost for this query is 1. By removing the ORDER BY clause (which causes the sort request), you can reduce the total cost of the query to 2, as shown in Example 6-6.
Interpreting an EXPLAIN Plan Analyzing Query Performance Interpreting an EXPLAIN Plan When you examine the EXPLAIN plan, look at all factors and information about the query. Use this information to assist you in tailoring queries so that they are efficient and return the information you require.
Interpreting an EXPLAIN Plan Analyzing Query Performance Table Scans Full table scans can affect performance adversely—especially if the table is quite large. To check for costly table scans, look for: • • Index selectivity of 100 percent No begin-key predicates, end-key predicates, or MDAM predicates Access Path Access can be by primary key or alternate index. Remember that an alternate index consists of all the columns defined for the index plus the column (or columns) that make up the primary key.
Interpreting an EXPLAIN Plan Analyzing Query Performance If the EXPLAIN plan indicates there are no key predicates, you might want to review the query and consider adding search conditions, based on leftmost key columns, that restrict the number of rows accessed, if feasible. For more information, see Writing Efficient Predicates on page 3-15. A lack of key predicates can also be a factor in causing a full table scan.
Interpreting an EXPLAIN Plan Analyzing Query Performance Selectivity The EXPLAIN plan lists selectivities for tables and indexes. Selectivity values influence the optimizer’s choice of the following: • Access path (base table, alternate index, or index only) For example, if the restriction specified by a WHERE predicate does not result in a low enough selectivity to justify alternate-index access, base-table access is chosen instead.
Reviewing Sample EXPLAIN Plans Analyzing Query Performance Options, provides a more thorough description of when parallel execution is and is not chosen.) If you are expecting parallel execution and it is not chosen, you might not have enabled parallel execution. You must enable parallel execution by specifying CONTROL EXECUTOR PARALLEL EXECUTION ON before SQL attempts to process a query or parts of a query in parallel.
EXPLAIN Plan for Simple SELECT Analyzing Query Performance Example 6-7. EXPLAIN Plan for Simple SELECT <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA8.PERSNL.
EXPLAIN Plan for Primary Access Analyzing Query Performance EXPLAIN Plan for Primary Access Example 6-8 and Example 6-9 show different EXPLAIN plans for the same query: one plan uses primary access; the other uses index-only access.
EXPLAIN Plan for Index-Only Access Analyzing Query Performance Plan step 1 involves two operations: a scan of the EMPLOYEE table and a sort operation. Operation 1.0 is a scan of the EMPLOYEE table. The query requires that 3 out of 6 columns be retrieved from the table. The EXPLAIN plan for this operation includes the following: • • • • • • • The lock granularity is row (record). The access type is stable (the default). The lock mode is chosen by the system. The access path is the primary key.
EXPLAIN Plans for Bounded Predicates Analyzing Query Performance Example 6-9. EXPLAIN Plan Choosing Index-Only Access <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA8.PERSNL.
Query With Lower Bound Analyzing Query Performance EXPLAIN SELECT LAST_NAME, FIRST_NAME, SALARY FROM EMPLOYEE WHERE SALARY >= 50000 ; The total cost of the query is 2. Example 6-10.
Query With Lower and Upper Bound Analyzing Query Performance Query With Lower and Upper Bound The query for the EXPLAIN plan in Example 6-11 specifies both a lower bound (>=50,000) and an upper bound (<=100,000): EXPLAIN SELECT LAST_NAME, FIRST_NAME, SALARY FROM EMPLOYEE WHERE SALARY >= 50000 AND SALARY <= 100000 ; The total cost of the query is 2. Example 6-11.
EXPLAIN Plan for Key Predicates Analyzing Query Performance EXPLAIN Plan for Key Predicates The EXPLAIN plan shows a query that specifies both a begin-key and an end-key predicate. The query retrieves data from the PARTS table. PARTNUM is the primary key. The query follows: EXPLAIN SELECT PARTNUM, PRICE, QTY_AVAILABLE FROM PARTS WHERE PARTNUM BETWEEN 5000 AND 7000 ; The total cost of the query is 1. Example 6-12.
EXPLAIN Plan for DISTINCT Analyzing Query Performance • Index and table selectivity are both approximately 26 percent. For the range predicates, selectivity is calculated as follows: 7000 - 5000 selectivity = -------------------------------SECONDHIGHVALUE - SECONDLOWVALUE • The selectivity is significantly lowered because of the addition of the upper bound; that is, the optimizer expects the number of rows returned will be smaller.
EXPLAIN Plan for ORDER BY Analyzing Query Performance EXPLAIN Plan for ORDER BY This query requires a sort operation to present columns in a specific order: EXPLAIN SELECT PARTNUM,PRICE,QTY_AVAILABLE FROM PARTS ORDER BY 2 DESC, 3 ASC, 1 ; The total cost of the query is 3. Example 6-14.
EXPLAIN Plans for GROUP BY Analyzing Query Performance The sort operation is requested explicitly in the query as a result of the ORDER BY clause. The purpose of the sort is to order columns in a specific order. The sort technique is FastSort.
SELECT With GROUP BY Using a Serial Plan Analyzing Query Performance Example 6-15. EXPLAIN Plan for SELECT With GROUP BY Using a Serial Plan (page 2 of 2) Access path 1 SBB for reads Begin key pred. End key pred. Index selectivity Index pred. Base table pred. : : : : : : : Primary Virtual None None Expect to examine 100% of rows from table None None Executor pred. Executor aggr.
SELECT With GROUP BY Using a Parallel Plan Analyzing Query Performance SELECT With GROUP BY Using a Parallel Plan This query creates an index on DEPTNUM and requests parallel execution: CONTROL EXECUTOR PARALLEL EXECUTION ON ; CREATE INDEX XDEPT ON EMPLOYEE (DEPTNUM,JOBCODE) PARTITION ( $DATA2.PERSNL.XDEPT FIRST KEY 4, $DATA5.PERSNL.XDEPT FIRST KEY 7); CONTROL TABLE EMPLOYEE ACCESS PATH INDEX XDEPT ; EXPLAIN SELECT DEPTNUM, JOBCODE, COUNT(*) FROM EMPLOYEE GROUP BY 1,2; Example 6-16.
EXPLAIN Plans for Subqueries Analyzing Query Performance Example 6-16. EXPLAIN Plan for SELECT With GROUP BY Using a Parallel Plan (page 2 of 2) Executor pred. Executor aggr. : None : Computed for each group COUNT ( * ) Table selectivity : Expect to select 100% of rows from table Expected row count: 57 rows after the scan Operation cost : 151 Operation 1.
Noncorrelated Subquery Analyzing Query Performance The total cost for the query is 2. Example 6-17. EXPLAIN Plan for Noncorrelated Subquery <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 : Scan Table : \SQL1.$DATA7.REG1.
Correlated Subquery Analyzing Query Performance The plan consists of two steps: • • Plan step 1 includes a scan of the ordone table. Plan step 2 includes a scan of the ordtwo table and is an evaluation of the subquery: SELECT B FROM ordtwo Plan step 2 executes once before plan step 1, which indicates that the subquery is evaluated once before the outer query and that this is a noncorrelated subquery.
Correlated Subquery Analyzing Query Performance Example 6-18. EXPLAIN Plan for Correlated Subquery (page 2 of 2) Executor pred. : On rows retrieved by the scan NOT ( EXISTS ( .. result of plan step 2 ) ) Pred.
EXPLAIN Plans for CASE Analyzing Query Performance EXPLAIN Plans for CASE This subsection contains EXPLAIN plans that show how the optimizer processes CASE. CASE With Multiple Conditions This query selects the last names, first names, and department numbers of employees with job codes of 100 whose salaries are less than 100,000 and also those employees with job codes of 600 whose salaries are less than 30,000.
CASE With Aggregates Analyzing Query Performance CASE With Aggregates This create statement and query appear under Computing Aggregates Based on Specific Conditions on page 1-45. create table emp (name char(8), age smallint NOT NULL, dept int, cars smallint NOT NULL, primary key name); select SUM(CASE SUM(CASE SUM(CASE SUM(CASE from emp; when when when when cars cars cars cars = 0 then 1 else = 1 then 1 else between 2 and 3 > 3 then 1 else 0 END), 0 END), then 1 else 0 END), 0 END) Example 6-20.
CASE for Finding the Highest Value in a Row Analyzing Query Performance CASE for Finding the Highest Value in a Row Following are the create statement and the query in Finding the Highest Value in a Row on page 1-47. The query retrieves the highest SAT scores from each row in the scores table. create table scores (name char(30), sat1 int NOT NULL, sat2 int NOT NULL, primary key name); select name, CASE when sat1 >= sat2 then sat1 else sat2 END from scores; Example 6-21.
Analyzing Query Performance CASE for Converting Long, Narrow Tables Into Short, Wide Ones CASE for Converting Long, Narrow Tables Into Short, Wide Ones Following are the create statement and the query in Converting Long, Narrow Tables Into Short, Wide Ones on page 1-49.
CASE for Converting Long, Narrow Tables Into Short, Wide Ones Analyzing Query Performance Example 6-22. EXPLAIN Plan for Converting Long, Narrow Tables Into Short, Wide Ones <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.
CASE for Ignoring the Largest and Smallest Values in a Set Analyzing Query Performance CASE for Ignoring the Largest and Smallest Values in a Set Following are the create statement and the query in Ignoring the Largest and Smallest Values in a Set on page 1-51. create table data (value primary key int NOT NULL, value); select x.value from data x, data y group by x.value having SUM (CASE when y.value <= x.value then 1 else 0 END) > 1 AND SUM (CASE when y.value >= x.
EXPLAIN Plans for String Functions Analyzing Query Performance Example 6-23. EXPLAIN Plan for Ignoring the Largest and Smallest Values in a Set (page 2 of 2) -------------------------------------------------------------------------Plan step 2 : Perform an Inner Join Join strategy : Nested Join Characteristic : Joins a row resulting from plan step 1 --------------------------------------------------------------------------Operation 2.0 Table : Scan : \SQL1.$DATA8.PUBS.
TRIM and CHAR_LENGTH Analyzing Query Performance Example 6-24. EXPLAIN Plan for SUBSTRING <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA8.PERSNL.
TRIM and CHAR_LENGTH Analyzing Query Performance Example 6-25. EXPLAIN Plan for TRIM and CHAR_LENGTH <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA8.PERSNL.
EXPLAIN Plans for MDAM Analyzing Query Performance EXPLAIN Plans for MDAM These examples show how the optimizer uses MDAM. MDAM With OR and Equality Predicate on Second Key Column In this example, an index is created on LAST_NAME, FIRST_NAME. The query specifies an OR on the first key column (LAST_NAME) and an equality predicate on the second key column (FIRST_NAME).
MDAM with Missing First Key Column Analyzing Query Performance MDAM with Missing First Key Column This creates an index on LAST_NAME, FIRST_NAME. The first key column is missing from the query predicate. CREATE INDEX XEMPNAME ON EMPLOYEE (LAST_NAME, FIRST_NAME); EXPLAIN SELECT * FROM EMPLOYEE WHERE FIRST_NAME = "Mary"; Example 6-27.
MDAM With IN List on Key Column Analyzing Query Performance MDAM With IN List on Key Column This creates an index on LAST_NAME, FIRST_NAME. The query uses MDAM to process an IN list for the LAST_NAME key column. CREATE INDEX XEMPNAME ON EMPLOYEE (LAST_NAME, FIRST_NAME); EXPLAIN SELECT * FROM EMPLOYEE WHERE LAST_NAME IN ("Marks","Jones") AND FIRST_NAME = "Mary"; Example 6-28.
MDAM With Multiple Predicate Sets, LIKE, and Missing Key Column Analyzing Query Performance An index is created on LAST_NAME, FIRST_NAME. CREATE INDEX XEMPNAME ON EMPLOYEE (LAST_NAME, FIRST_NAME); EXPLAIN SELECT * FROM EMPLOYEE WHERE ((LAST_NAME = "Marks" OR LAST_NAME = "Jones") AND FIRST_NAME = "Mary") OR (FIRST_NAME LIKE "Ha%"); Example 6-29.
Analyzing Query Performance EXPLAIN Plan for Determining the Cost of Multiple Predicate Sets EXPLAIN Plan for Determining the Cost of Multiple Predicate Sets This query contains a predicate that SQL converts into multiple predicate sets. An index is created for DEPTNUM and JOBCODE.
EXPLAIN Plan for Selectivity for Range Predicates Analyzing Query Performance Example 6-30. EXPLAIN Plan for Determining the Cost of Multiple Predicate Sets (page 2 of 2) Pred. selectivity : Expect to select 30.5861% of rows from index Base table pred. : Will be evaluated by the disk process ( ( SALARY = 50000 ) AND ( JOBCODE = 450 ) ) OR ( ( ( JOBCODE = 500 ) OR ( JOBCODE = 600 ) ) AND ( ( DEPTNUM = 4000 ) OR ( DEPTNUM = 7000 ) ) ) Pred. selectivity : Expect to select 30.
EXPLAIN Plans for Join Queries Analyzing Query Performance Example 6-31. EXPLAIN Plan for Selectivity for Range Predicates <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA8.PERSNL.
Parallel Execution of Hash Join Analyzing Query Performance For more information on ESPs, see Processor Assignment by the SQL/MP Optimizer and Executor for Executor Server Processes (ESPs) on page 2-5. Parallel Execution of Hash Join The EXPLAIN plan in Example 6-32 chooses parallel execution for a join operation. SQL chooses the hash join method to join the tables. The hash join method is often chosen when joining a large table and a much smaller table.
Parallel Execution of Hash Join Analyzing Query Performance Example 6-32. EXPLAIN Plan for Hash Join (page 2 of 3) --------------------------------------------------------------------------Plan step 2 : Perform an Inner Join Join strategy : Hybrid Hash Join Each operation is performed in parallel for this step Each ESP will read one of the following partitions: \SQL1.$DATA3 \SQL1.$DATA4 \SQL1.
Nested Inner Join Analyzing Query Performance Example 6-32. EXPLAIN Plan for Hash Join (page 3 of 3) Operation 2.1 : Hash Requested : Hash rows in the : Purpose : Hash key columns : Hash cost : By the optimizer Join composite excluding current table To hash its rows before the Join A.TENPCT 126 Operation 2.2 : Hash Requested : Hash rows in the : Purpose : Hash key columns : Hash cost : By the optimizer Current table To hash its rows before the Join B.
Nested Inner Join Analyzing Query Performance Example 6-33. EXPLAIN Plan for Nested Inner Join (page 1 of 2) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 : Will utilize parallel execution SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 Each operation is performed in parallel for this step Each ESP will read one of the following partitions: \SQL1.$DATA3 \SQL1.
Cross Product Join Analyzing Query Performance Example 6-33. EXPLAIN Plan for Nested Inner Join (page 2 of 2) --------------------------------------------------------------------------Operation 2.0 Table : Scan : \SQL1.$DATA2.WISCREG.TENKTUP2 with correlation name B Access type : Record locks, stable access Lock mode : Chosen by the system Column processing : Requires retrieval of 16 out of 16 columns Access path 1 SBB for reads Begin key pred. End key pred. Index selectivity Index pred. Base table pred.
Cross Product Join Analyzing Query Performance Example 6-34. EXPLAIN Plan for Cross Product Join (page 1 of 2) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA2.WISCREG.
Cross Product Join Analyzing Query Performance Example 6-34. EXPLAIN Plan for Cross Product Join (page 2 of 2) Access path 1 : Primary, partitioned, sequential cache SBB for reads : Virtual Begin key pred. : None End key pred. : None Index selectivity : Expect to examine 100% of rows from table Index pred. : None Base table pred. : Will be evaluated by the disk process B.TWENTY = C.TWENTY Pred. selectivity : Expect to select 5% of rows from table Executor pred.
Parallel Execution of Nested Inner Join Analyzing Query Performance Parallel Execution of Nested Inner Join This example and the following merged join example both use the query from Example 6-32 on page 6-49, but each uses a different CONTROL TABLE directive: CONTROL TABLE * JOIN METHOD NESTED; EXPLAIN SELECT * FROM TENKTUP1 A, TENKTUP2 B WHERE A.TENPCT = B.TENPCT ; Example 6-35.
Parallel Execution of Nested Inner Join Analyzing Query Performance Example 6-35. EXPLAIN Plan for Nested Inner Join (page 2 of 2) --------------------------------------------------------------------------Plan step 2 : Perform an Inner Join Join strategy : Nested Join Plan Forced : Join Method forced by user directive Each operation is performed in parallel for this step Each ESP from previous step will join one of the following partitions: \SQL1.$DATA3 \SQL1.$DATA4 \SQL1.
Parallel Execution of Forced Merged Inner Join Analyzing Query Performance Parallel Execution of Forced Merged Inner Join This EXPLAIN plan chooses parallel execution for an inner join query. The merge join method was chosen to join the tables. The query for this example is the same as the one for Example 6-32 on page 6-49, but the CONTROL TABLE directive differs: CONTROL TABLE * JOIN METHOD MERGE; EXPLAIN SELECT * FROM TENKTUP1 A, TENKTUP2 B WHERE A.TENPCT = B.
Parallel Execution of Forced Merged Inner Join Analyzing Query Performance Example 6-36. EXPLAIN Plan for Parallel Execution of Sort Merge Inner Join (page 2 of 3) --------------------------------------------------------------------------Plan step 2 : Perform an Inner Join Join strategy : Merge Join Plan Forced : Join Method forced by user directive Each operation is performed in parallel for this step Each ESP will read one of the following partitions: \SQL1.$DATA3 \SQL1.$DATA4 \SQL1.
Key-Sequenced Merge Join Analyzing Query Performance Example 6-36. EXPLAIN Plan for Parallel Execution of Sort Merge Inner Join (page 3 of 3) Operation 2.2 : Sort Requested : Sort rows in the : Purpose : Sort technique : Sort type : Sort key columns : Sort cost : Total cost By the optimizer Current table To order its rows before the Join FASTSORT Insertion into an entry-sequenced disk file A.
Key-Sequenced Merge Join Analyzing Query Performance CONTROL TABLE * JOIN METHOD KEY SEQUENCED MERGE; EXPLAIN SELECT X.ORDER_DATE, X.ORDERNUM, COUNT(*) FROM ORDERS X,ODETAIL Y WHERE X.ORDERNUM = Y.ORDERNUM GROUP BY X.ORDERNUM,X.ORDER_DATE ORDER BY X.ORDER_DATE,X.ORDERNUM ; Example 6-37.
Key-Sequenced Merge Join With Executor Aggregates Analyzing Query Performance Example 6-37. EXPLAIN Plan for Key-Sequenced Merge Join (page 2 of 2) Executor pred. : Table selectivity : Expected row count: Operation cost : Operation 2.1 : Sort Requested : Sort rows in the : Purpose : Sort technique : Sort type : UPS workspace : Sort key columns : Expected row count: Sort cost : Total cost None Expect to select 7.
Key-Sequenced Merge Join With Executor Aggregates Analyzing Query Performance Example 6-38. EXPLAIN Plan for Key-Sequenced Merge Join With Executor Aggregates <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.
Left Join Not Transformed Into an Inner Join Analyzing Query Performance This plan contains two steps. Step 2 shows that the key-sequenced merge join method was chosen. Executor aggregates were chosen because aggregates cannot be done in DP2 on the inner table of a key-sequenced merge join. In plan step 1, the access path is alternate, index only, sequential cache. The operation cost is 2187. In plan step 2, the access path is alternate, unique, index only, sequential cache. No sort takes place.
Left Join Not Transformed Into an Inner Join Analyzing Query Performance Example 6-39. EXPLAIN Plan for Left Join Not Transformed Into an Inner Join (page 1 of 2) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< -------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.
Left Join Transformed Into an Inner Join Analyzing Query Performance Example 6-39. EXPLAIN Plan for Left Join Not Transformed Into an Inner Join (page 2 of 2) Operation 2.1 : Hash Requested : Hash rows in the : Purpose : Hash key columns : Hash cost : By the optimizer Current table To hash its rows before the Join O.SALESREP 1 Operation 2.
Left Join Transformed Into an Inner Join Analyzing Query Performance Example 6-40. EXPLAIN Plan for Left Join Transformed Into an Inner Join <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA8.SALES.
EXPLAIN Plan for UNION Operation Analyzing Query Performance The IS NOT NULL predicate in the query eliminates the special null-augmented rows produced by the left join operator. The search condition retains only the joined rows that are the same as those that result for an inner join. No performance gain results if SQL generates null-augmented rows that will be discarded, so Query Rewrite changes the left join to an inner join. The EXPLAIN plan shows that plan step 2 is an inner join.
Analyzing Query Performance EXPLAIN Plan for UNION Operation Example 6-41. EXPLAIN Plan for UNION Operation (page 1 of 2) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Union of Selects <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 : perform a Union --------------------------------------------------------------------------Operation 1.
EXPLAIN Plan for MAX Optimization Analyzing Query Performance Example 6-41. EXPLAIN Plan for UNION Operation (page 2 of 2) Executor pred. : Table selectivity : Expected row count: Operation cost : Total cost None Expect to select 100% of rows from table 1 row after the scan 1 : 3 Plan step 1 is a union of the result of plan steps 2 and 3. Plan step 1 executes after plan steps 2 and 3. Plan step 1 involves a sort operation to eliminate duplicate rows in the union result.
EXPLAIN Plan for MAX Optimization Analyzing Query Performance Example 6-42. EXPLAIN Plan for MAX Optimization <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA8.PERSNL.
EXPLAIN Plan for Cursor UPDATE Analyzing Query Performance EXPLAIN Plan for Cursor UPDATE This EXPLAIN plan shows an UPDATE operation using a cursor. The query updates the EMPLOYEE table (primary key EMPNUM) according to data in the TABLES catalog table (primary key TABLENAME). The query follows: EXPLAIN UPDATE EMPLOYEE SET JOBCODE = 105 WHERE EXISTS ( SELECT TABLENAME FROM TABLES WHERE TABLETYPE = 'VI' ); The total cost of the query is 54. Example 6-43.
EXPLAIN Plan for Cursor DELETE Analyzing Query Performance Example 6-43. EXPLAIN Plan for Cursor UPDATE (page 2 of 2) Executor pred. : On rows retrieved by the scan EXISTS ( .. result of plan step 2 ) Pred.
EXPLAIN Plan for INSERT Analyzing Query Performance The total cost of the query is 1. Example 6-44. EXPLAIN Plan for Cursor DELETE <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Delete <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA8.PERSNL.
EXPLAIN Plan for INSERT-SELECT Analyzing Query Performance Example 6-45. EXPLAIN Plan for INSERT Statement <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Insert <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 : Insert Table : \SQL1.$DATA7.REG1.
EXPLAIN Plan for INSERT-SELECT Analyzing Query Performance Example 6-46. EXPLAIN Plan for INSERT With SELECT <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Insert-Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA7.REG1.
EXPLAIN Plan for UPDATE Analyzing Query Performance EXPLAIN Plan for UPDATE This example shows the EXPLAIN plan for an UPDATE operation. The query updates the EMPLOYEE table as follows: EXPLAIN UPDATE employee SET deptnum = 5 WHERE empnum = ?parm3; The total cost of the query is 1. Example 6-47.
EXPLAIN Plan With Date-Time Values Analyzing Query Performance EXPLAIN Plan With Date-Time Values These examples show EXPLAIN plans for an UPDATE operation. If no end-date-time is provided for an INTERVAL data type and is implied for a startdate-time, SQL expands the original syntax of the query to show the implied end-datetime. HOUR Date-Time Values Example 6-48 on page 6-79 shows HOUR(2) expanded to HOUR. The query used in this example is as follows. INVOKE B2UNS01 ; Definition of table \SQL1.$DATA5.
DAY Date-Time Values Analyzing Query Performance Example 6-48. EXPLAIN Plan With HOUR Date-Time Values <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA5.SQLDOPTS.
Comparing Cost: A Scenario Analyzing Query Performance WHERE INT0_DTOF6_UNIQ >= INTERVAL '0' DAY(3) AND INT0_DTOF6_UNIQ <= INTERVAL '2' DAY(3) ; Example 6-49.
First Formulation Analyzing Query Performance After executing the query, use the DISPLAY STATISTICS command to display the statistics as shown in Example 6-50 The estimated cost is 50. The ODETAIL table is scanned once. The PARTS table is scanned twice. Example 6-50. DISPLAY STATISTICS Output for QUERY1 Estimated Cost Start Time End Time Elapsed Time SQL Execution Time 50 95/09/11 09:08:13.871517 95/09/11 09:08:14.546697 00:00:00.675180 00:00:00.229874 Records Table Name Accessed \SQL1.$DATA8.SALES.
First Formulation Analyzing Query Performance Example 6-51. EXPLAIN Plan for QUERY1 (page 1 of 2) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Query plan 1 SQL request : Select <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< --------------------------------------------------------------------------Plan step 1 --------------------------------------------------------------------------Operation 1.0 Table : Scan : \SQL1.$DATA8.SALES.
Second Formulation Analyzing Query Performance Example 6-51. EXPLAIN Plan for QUERY1 (page 2 of 2) --------------------------------------------------------------------------Plan step 3 Characteristic : Executes once per row retrieved in plan step 2 --------------------------------------------------------------------------Operation 3.0 Table : Scan : \SQL1.$DATA8.SALES.
Second Formulation Analyzing Query Performance After executing the query, use the DISPLAY STATISTICS command to display the statistics shown in Example 6-52 on page 6-84. Example 6-52. DISPLAY STATISTICS Output for QUERY2 Estimated Cost Start Time End Time Elapsed Time SQL Execution Time 4 95/09/11 09:08:17.871610 95/09/11 09:08:18.010648 00:00:00.139038 00:00:00.016799 Records Table Name Accessed \SQL1.$DATA8.SALES.PARTS 1 \SQL1.$DATA8.SALES.
Second Formulation Analyzing Query Performance Example 6-53. EXPLAIN Plan for QUERY2 (page 2 of 2) --------------------------------------------------------------------------Plan step 2 Characteristic : Executes once before plan step 1 --------------------------------------------------------------------------Operation 2.0 Table : Scan : \SQL1.$DATA8.SALES.
Second Formulation Analyzing Query Performance HP NonStop SQL/MP Query Guide —524488-003 6- 86
Index A Access option 4-16 Access path alternate index 4-5 base table, through an index 4-6 description of 4-4 displaying with EXPLAIN 6-13 index-only 4-6 primary key 4-5 table scan 4-4 Aggregate functions AVG 1-41 CASE and 1-45 conditional 1-45 COUNT 1-41 description of 1-41 evaluating 3-46, 3-47, 6-83 MAX 1-41 MIN 1-41 optimizing 3-46 SUM 1-41 ALL predicate 1-38 ANY predicate 1-38 Assignment, processor 2-5 AVG function 1-41 B Base table accessing through an index 4-6 accessing through primary key 4-5 pre
C Index Comparison predicate description of 1-31 multivalued 1-40 operators 1-31 COMPILE option 4-33 Composite table 3-34 Concatenation operator 1-20 Concurrency, optimizing 4-32 Conditional aggregates 1-45 Consistency 4-16 Constant expression 3-10 Constants, propagation of 3-10 Context-free servers 1-41 CONTROL EXECUTOR directive 4-13 CONTROL QUERY directive BIND NAMES option 4-33 EXPLAIN plan and 6-13 HASH JOIN option 3-43 INTERACTIVE ACCESS option 4-6, 4-8 CONTROL TABLE directive ACCESS PATH option 4-1
D Index D Data date-time type 1-21 density 4-30, 5-18 distribution 4-9, 5-2 manipulation statements 1-2 missing, using left join to reveal 1-55 repartitioned 2-5 retrieval from two tables 1-51 of multiple rows 1-12 ordering the results of 1-5 SELECT statement and 1-2 single row and 1-10 specifying search conditions for 1-8 with cursor 1-12 sparsity 4-30, 5-18 transfer, cost of 5-13 Data distribution, uneven access 4-9 selectivity 5-2 Database designing 4-1 searching 1-8 DATEFORMAT function 1-24, 1-26 Date
F Index EXECUTE statement 3-3 Execution statistics, displaying 6-3 Execution-time name resolution 4-33 Executor 2-3 Executor predicate description of 3-15 EXPLAIN plan and 3-12 EXPLAIN plan evaluation 6-14 list of 3-15 Executor server process (ESP) description 4-14 processor assignment 2-5 EXISTS predicate description of 1-37 performance implication 3-16 EXPLAIN examples bounded predicates in 6-20 CASE for converting tables 6-36 for finding the highest value in a row 6-35 for ignoring largest and smallest
G Index FROM clause of SELECT statement 1-2 Full table scan, avoiding 3-54, 4-4 Functions aggregate 1-41 AVG 1-41 COUNT 1-41 CURRENT 1-25 DATEFORMAT 1-26 date-time 1-24 DAYOFWEEK 1-26 EXTEND 1-27 JULIANTIMESTAMP 1-27 MAX 1-41 MIN 1-41 SUBSTRING 1-15 SUM 1-41 G Granularity 4-19 GROUP BY clause efficient use of 3-47, 3-54 EXPLAIN plan for 6-26, 6-28 MDAM and 3-49 of SELECT statement 1-4, 1-7 parallel plans and 3-49 selectivity 5-10 serial plans and 3-48 UNION operation and 1-64 with DISTINCT 3-58 with ORDE
J Index INSERT statement EXPLAIN plan for 6-74 for modifying data 1-2 INSERT-SELECT, EXPLAIN plan for 6-75 Integrity of data 4-16 INTERVAL data type 1-21, 1-22 INTO clause 1-11 J Join methods See also Join operation See also Join predicate evaluation by optimizer 3-34 hash join 3-30 key-sequenced merge join 3-27 nested join 3-25 relative performance 3-36 sort merge join 3-25 Join operation cost of 5-15 description of 1-51 evaluating 3-24 EXPLAIN plans for 6-48 inner 1-53, 1-54 left (outer) 1-53, 1-55 met
M Index Lock mode description of 4-16 exclusive 4-19 shared 4-19 LOCK TABLE statement 4-19 Locks access options 4-16 characteristics of 4-20 description of 4-16 duration of 4-16, 4-20 exclusive 4-19 EXPLAIN plan and 6-15 granularity of 4-19 mode 4-16, 4-19 shared 4-19 M MAX function description of 1-41 EXPLAIN plan for 6-70 optimizing 3-46 MDAM (MultiDimensional Access Method) CONTROL QUERY directive 4-7 CONTROL TABLE directive 4-12 controlling the dense or sparse algorithm 4-30 controlling the number of
P Index ON clause in join operations 1-59, 1-60 search conditions and 1-9 Online transaction processing (OLTP) considerations 3-62 OPEN cursor statement 1-12 Operators for predicates 1-30 Optimization See Performance Optimizer choosing an execution plan 2-3 determining cost with 5-11, 6-12 join strategy 3-34 OR operator description of 1-35 optimizing 3-22 specifying indexes for improved performance with 3-22 ORDER BY clause EXPLAIN plan for 6-10, 6-25 of SELECT statement 1-4, 1-5 UNION operation and 1-63
Q Index Predicates (continued) ANY 1-38 base table 3-14 BETWEEN 1-32 comparison 1-31 connected by OR operators 1-35 controlling generation of additional 3-20 cost of 5-16 efficient writing of 3-15 eliminating 3-11 evaluating 3-4 executor 3-15 EXISTS 1-37 IN 1-36, 3-7 index 3-13 join 3-18, 3-40 key 3-13 LIKE description of 1-33 efficiency and 3-24 with CHAR columns 1-34 with columns of varying length 1-34 with TRIM 1-35 missing key columns 3-16 multivalued 1-40, 3-19 NULL 1-39 operators in 1-30 OR operator
R Index Query execution plan (continued) generating 6-9 interpreting 6-12 Query optimization See also Performance CONTROL EXECUTOR directive 4-13 CONTROL TABLE directive 4-28 cursor operations and 4-28 formulating queries for 3-1 keeping statistics current for 4-2 specifying appropriate access type for 4-17 specifying buffered operations for 4-28 Query Rewrite description of 3-9 for propagating constants 3-10 left joins and 3-9, 6-66, 6-68 using 3-8 R Range predicate description of 1-32 MDAM and 3-13 per
S Index SELECT statement (continued) single-row 1-10, 3-60 WHERE clause 1-9 Selectivity and EXPLAIN plan 6-15 computing description of 5-2 when a range is used 5-5 when no range is used 5-4 default values for 5-9 definition of 5-1 index 5-6 predicate 5-3 statistics for estimating 5-2 table 5-7 Self join 3-42 Sequential access 4-28 Sequential block buffering (SBB) See SBB Sequential cache, EXPLAIN plan and 6-14 Sequential table scan 4-5 SET STATISTICS ON option 6-4 Shared lock 4-19 SHARED mode, LOCK TABLE
T Index Subquery (continued) evaluating 3-51 noncorrelated 1-29, 3-53 nonquantified 1-29 quantified 1-29 SUBSTRING function data types for results 1-16 empty string results 1-17 errors 1-17 EXPLAIN plan for 6-39 in a query 1-16 null results 1-17 to extract part of a string 1-15 SUM function 1-41 T Tables controlling opening of 4-29 cost of accessing 5-12 full scans of, avoiding 3-54, 4-4 inner 3-25 joining 1-51 outer 3-25 scanning 4-4 scans and EXPLAIN 6-13 selectivity of 5-1, 5-7 TIME data type 1-21 Tim