SQL/MX Data Mining Guide
Mining the Data
HP NonStop SQL/MX Data Mining Guide—523737-001
Building Decision Trees
GENDER M ? Y 6 
MARITAL STATUS Divorced ? N 1 
MARITAL STATUS Divorced ? Y 5 
MARITAL STATUS Married ? N 2 
MARITAL STATUS Married ? Y 1 
MARITAL STATUS Single ? Y 3 
MARITAL STATUS Widow ? N 1 
MARITAL STATUS Widow ? Y 1 
NUMBER_CHILDREN ? 0 N 2 
NUMBER_CHILDREN ? 0 Y 5 
NUMBER_CHILDREN ? 1 Y 2 
NUMBER_CHILDREN ? 2 N 1 
NUMBER_CHILDREN ? 2 Y 3 
NUMBER_CHILDREN ? 3 N 1 
--- 17 row(s) selected.                
Determining Which Attribute Best Predicts the Goal
Consider the results of the preceding query. You are ready to determine which of the 
independent variables best predicts the dependent variable (the goal).
Examine the rows for each independent variable in the query result. If the counts for 
the values of an independent variable fall predominantly under Cust_Left equal to Y 
for some values and not for others, that independent variable is a good predictor of 
the goal. This type of analysis is typically performed by client-mining tools.
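As an illustration of this examination, the following minimal Python sketch computes, for each value of an independent variable, the share of customers with Cust_Left equal to Y. It is not part of SQL/MX; the counts are copied from the Marital Status and Number_Children rows of the query output shown above, and the names MARITAL_STATUS, NUMBER_CHILDREN, and left_share are hypothetical names chosen for this example. Gender is omitted because only one of its rows appears in the output shown here.

```python
# Counts copied from the query output: (attribute value, Cust_Left) -> customers.
MARITAL_STATUS = {
    ("Divorced", "N"): 1, ("Divorced", "Y"): 5,
    ("Married", "N"): 2, ("Married", "Y"): 1,
    ("Single", "Y"): 3,
    ("Widow", "N"): 1, ("Widow", "Y"): 1,
}
NUMBER_CHILDREN = {
    (0, "N"): 2, (0, "Y"): 5,
    (1, "Y"): 2,
    (2, "N"): 1, (2, "Y"): 3,
    (3, "N"): 1,
}


def left_share(counts):
    """Return {value: (customers_left, total_customers, share_left)}."""
    totals, left = {}, {}
    for (value, cust_left), n in counts.items():
        totals[value] = totals.get(value, 0) + n
        if cust_left == "Y":
            left[value] = left.get(value, 0) + n
    return {v: (left.get(v, 0), t, left.get(v, 0) / t) for v, t in totals.items()}


for name, counts in [("MARITAL STATUS", MARITAL_STATUS),
                     ("NUMBER_CHILDREN", NUMBER_CHILDREN)]:
    for value, (y, total, share) in left_share(counts).items():
        print(f"{name:16} {value!s:9} {y}/{total} left ({share:.0%})")
```

A value whose share is close to 100% or close to 0% separates the customers who left from those who stayed; values with shares near the overall rate carry little information.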
Both Gender and Marital Status are reasonable choices as the best predictor of the 
goal. To carry out the remaining cross-table generations, this scenario uses Marital 
Status as the best predictor for the initial branch of the decision tree.
Independent Variable  Predictor?  Reason
GENDER                Yes         When Cust_Left is equal to Y, Gender is
                                  predominantly equal to M. The number of
                                  Males is 6, and the number of Females is 4.
MARITAL STATUS        Yes         When Cust_Left is equal to Y, Marital Status
                                  is predominantly equal to Divorced and
                                  Single. The number of Divorced is 5, the
                                  number of Married is 1, the number of Single
                                  is 3, and the number of Widow is 1.
NUMBER_CHILDREN       No          When Cust_Left is equal to Y,
                                  Number_Children is 0, 1, or 2. The number
                                  with Children=0 is 5, the number with
                                  Children=1 is 2, and the number with
                                  Children=2 is 3. The values do not show a
                                  pattern and do not predict Cust_Left equal
                                  to Y.
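The guide does not state which measure a client tool uses to rank predictors. As one common possibility, the following Python sketch scores an attribute by information gain (the reduction in entropy of Cust_Left after splitting on the attribute). This is an assumed criterion for illustration, not the guide's method; the function names and the (left, stayed) count layout are hypothetical, and the counts are taken from the Marital Status and Number_Children rows of the query output. Gender is left out because its complete counts are not shown in this part of the output.

```python
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)


def information_gain(groups):
    """groups maps each attribute value to (left_count, stayed_count)."""
    total_left = sum(left for left, _ in groups.values())
    total_stayed = sum(stayed for _, stayed in groups.values())
    total = total_left + total_stayed
    before = entropy([total_left, total_stayed])
    # Entropy after the split, weighted by the size of each group.
    after = sum((left + stayed) / total * entropy([left, stayed])
                for left, stayed in groups.values())
    return before - after


# (Cust_Left = Y, Cust_Left = N) counts per value, from the query output.
marital = {"Divorced": (5, 1), "Married": (1, 2), "Single": (3, 0), "Widow": (1, 1)}
number_children = {0: (5, 2), 1: (2, 0), 2: (3, 1), 3: (0, 1)}

print(f"MARITAL STATUS   gain = {information_gain(marital):.3f}")
print(f"NUMBER_CHILDREN  gain = {information_gain(number_children):.3f}")
```

On these counts Marital Status scores higher than Number_Children, which agrees with the choice of Marital Status for the initial branch. With only 14 customers the difference is small, so the result is best read as a cross-check rather than as proof.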










