SQL/MX 2.x Reference Manual (G06.24+, H06.03+)
SQL/MX Clauses
HP NonStop SQL/MX Reference Manual—523725-004
7-9
SAMPLE Clause
You can determine the actual size of the sample. Suppose that there are N 
rows in the intermediate result table. Each row is picked with a probability of 
r%, where r is the sample size in PERCENT. Therefore, the actual size of the 
resulting sample is approximately r% of N. The number of rows picked follows 
a binomial distribution with mean equal to r * N/100.
If you specify a sample size greater than 100 PERCENT, NonStop SQL/MX 
returns all the rows in the result table plus duplicate rows. The duplicate rows 
are picked from the result table according to the specified sampling method. 
This technique is referred to as oversampling and is not allowed with cluster 
sampling.
ROWS
specifies row sampling. Row sampling is the default if you specify neither 
ROWS nor CLUSTERS.
CLUSTERS OF number-blocks BLOCKS
specifies cluster sampling. You can use the CLUSTERS clause for a base 
table only if there is no WHERE clause in the SELECT statement. First, a 
cluster is chosen randomly, and then all rows in the selected cluster are 
added to the sample. The size of the cluster is determined by number-
blocks. This process is repeated until the sample size is generated. See 
Cluster Sampling on page 7-10.
BALANCE
If you specify a BALANCE expression, NonStop SQL/MX performs 
stratified sampling. The intermediate result table is divided into disjoint 
strata based on the WHEN conditions. Each stratum is sampled 
independently by using the sampling size. For a given row, the stratum to 
which it belongs is determined by the first WHEN condition that is true for 
that row—if there is a true condition. If there is no true condition, the row 
belongs to the ELSE stratum. 
FIRST rows-size [SORT BY colname [ASC[ENDING] | DESC[ENDING]] 
 [,colname [ASC[ENDING] | DESC[ENDING]]]...]
directs NonStop SQL/MX to choose the first rows from the result table. You can 
specify the order of the rows to sample. Otherwise, NonStop SQL/MX chooses an 
arbitrary order. The sampling size is determined by the rows-size, defined as:
 number-rows ROWS 
| BALANCE WHEN condition THEN number-rows ROWS 
 [WHEN condition THEN number-rows ROWS]...
 [ELSE number-rows ROWS] END 
specifies the value of the size for FIRST sampling by using the number of rows 
intended in the sample. The value number-rows must be an integer literal.










