SQL/MX 2.x Reference Manual (G06.24+, H06.03+)

ManualsBrandsHP ManualsServerHP NonStop G-Series

711

712

713

714

715

716

717

718

719

720

SQL/MX Clauses

HP NonStop SQL/MX Reference Manual—523725-004

7-10

Considerations for SAMPLE

You can determine the actual size of the sample. Suppose that there are N

rows in the intermediate result table. If the size s of the sample is specified as

a number of rows, the actual size of the resulting sample is the minimum of s

and N.

PERIODIC rows-size EVERY number-rows ROWS

[SORT BY colname [ASC[ENDING] | DESC[ENDING]]

[,colname [ASC[ENDING] | DESC[ENDING]]]...]

directs NonStop SQL/MX to choose the first rows from each block (or period) of

contiguous rows. This sampling method is equivalent to a separate FIRST

sampling for each period, and the rows-size is defined as in FIRST sampling.

The size of the period is specified as a number of rows. You can specify the order

of the rows to sample. Otherwise, NonStop SQL/MX chooses an arbitrary order.

You can determine the actual size of the sample. Suppose that there are N rows in

the intermediate result table. If the size s of the sample is specified as a number of

rows and the size p of the period is specified as a number of rows, the actual size

of the resulting sample is calculated as:

FLOOR (N/p) * s + minimum (MOD (N, p), s)

Note that minimum in this expression is used simply as the mathematical minimum

of two values.

Considerations for SAMPLE

Sample Rows

In general, when you use the SAMPLE clause, the same query returns different sets of

rows for each execution. The same set of rows is returned only when you use the

FIRST and PERIODIC sampling methods with the SORT BY option, where there are

no duplicates in the specified column combination for the sort.

Cluster Sampling

Cluster sampling is an option supported by the SAMPLE RANDOM clause in a

SELECT statement. A cluster, in this sense, is a logically contiguous set of disk blocks

in the file in which a table is stored. The number of blocks in a cluster is specified in the

CLUSTERS subclause of the SAMPLE RANDOM clause. For example:

SELECT * FROM customers

SAMPLE RANDOM 1 PERCENT

CLUSTERS OF 2 BLOCKS;

This query randomly selects one percent of the clusters in the CUSTOMERS table and

then adds each row in all selected clusters to the result table. In other words, think of

the CUSTOMERS table as a sequence of disk blocks, where each two blocks in the

sequence is a cluster. The preceding query selects one percent of the clusters at

random and then returns all the rows in each selected cluster.