SQL/MX Data Mining Guide
Introduction
HP NonStop SQL/MX Data Mining Guide—523737-001
algorithms that load the entire data set into memory and perform necessary 
computations.
The extract approach has two major limitations:
• It does not scale to large data sets, because the entire data set must fit in memory. Statistical sampling can be used to avoid this limitation; however, sampling is inappropriate in many situations because it can cause patterns to be missed, such as those in small groups or those between records.
• It cannot conveniently manage multiple versions of the data across the numerous iterations of a typical knowledge discovery investigation. For example, each iteration might require extracting additional data, performing incremental updates, deriving new attributes, and so on.
The SQL/MX Approach
In most enterprise organizations today, database systems are crucial for conducting business. These DBMSs serve as the transaction processing systems for daily operations and manage the data warehouses that contain huge amounts of historical information. The validated data in these warehouses is already being used for online analysis and is a natural starting point for knowledge discovery.
The SQL/MX approach identifies fundamental data structures and operations that are 
common across a wide range of knowledge discovery tasks and builds such structures 
and operations into the DBMS. The primary advantages of the SQL/MX technology 
over traditional data mining techniques include:
• The ability to mine much larger data sets, not only data in flat-file extracts
• Simplified data management
• More complete results
• Better performance and reduced cycle times
The main features of the SQL/MX approach are summarized next.
Data-Intensive Computations Performed in the DBMS
Tools and applications perform data-intensive data-preparation tasks in the DBMS by 
using an SQL interface. As a result, you can access the powerful and parallel DBMS 
data manipulation capabilities in the data preparation stage of the knowledge discovery 
process. 
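To make the idea concrete, the following sketch uses SQLite purely as a stand-in DBMS (it is not SQL/MX, and the table and column names are illustrative only). It contrasts with the extract approach: a derived aggregate attribute is computed by the database engine through the SQL interface, so only the small result set, never the full detail data, reaches application memory.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the DBMS; the table
# and columns (sales, region, amount) are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 300.0), ("west", 50.0), ("west", 150.0)],
)

# Data preparation pushed into the DBMS: the engine derives a new
# aggregate attribute (total per region) in a single SQL statement,
# instead of the application loading every row and summing in memory.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total FROM sales "
    "GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 400.0), ('west', 200.0)]
```

In a real SQL/MX deployment the same SQL statement would execute in parallel across the DBMS, but the division of labor shown here is the point: the data-intensive work stays in the engine.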
Use of Built-In DBMS Data Structures and Operations
Fundamental data structures and operations are built into the DBMS to support a wide 
range of knowledge discovery tasks and algorithms in an efficient and scalable 
manner.
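One example of such a built-in operation is statistical sampling executed inside the DBMS, so that only the sample ever leaves the engine. The sketch below again uses SQLite for illustration; `ORDER BY RANDOM() LIMIT n` is a stand-in for a dedicated sampling operator, and the `transactions` table is hypothetical.

```python
import sqlite3

# Stand-in DBMS with a hypothetical table of 1,000 transactions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(i, float(i) * 1.5) for i in range(1000)],
)

# In-database sampling: the engine selects a random subset, so only
# 10 of the 1,000 rows cross into application memory.
sample = conn.execute(
    "SELECT id, amount FROM transactions ORDER BY RANDOM() LIMIT 10"
).fetchall()
print(len(sample))  # 10
```

Building the sampling step into the engine is what lets a knowledge discovery tool work against the full warehouse table while still controlling how much data it materializes.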