SQL/MX Data Mining Guide
Mining the Data
HP NonStop SQL/MX Data Mining Guide—523737-001
Building Decision Trees
-------------------- --- ------ --------- --------
GENDER F ? Y 2
GENDER M ? Y 1
NUMBER_CHILDREN ? 0 Y 3
--- 3 row(s) selected.
The preceding query shows the split results for the Single branch when Cust_Left is
equal to Y. Gender divides the three records between F (2) and M (1), and therefore
Gender is not a good predictor when Marital Status is equal to Single. However, all
three records have Number_Children equal to 0, and therefore Number_Children is a
good predictor.
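The split counts above can be reproduced with a small sketch. It is only an illustration, not the guide's SQL/MX query: it uses Python's built-in sqlite3 module, the table name single_customers and the helper split_counts are hypothetical, and the three rows are reconstructed from the counts in the preceding output.

```python
import sqlite3

# In-memory table holding only the Single branch (hypothetical table name).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE single_customers "
    "(gender TEXT, number_children INTEGER, cust_left TEXT)"
)

# Three rows reconstructed from the query output: 2 F, 1 M, all with
# number_children = 0 and cust_left = 'Y'.
conn.executemany(
    "INSERT INTO single_customers VALUES (?, ?, ?)",
    [("F", 0, "Y"), ("F", 0, "Y"), ("M", 0, "Y")],
)


def split_counts(column):
    """Count records per (attribute value, cust_left) pair for one attribute."""
    # The column name comes from the fixed calls below, never from user input.
    sql = (
        f"SELECT {column}, cust_left, COUNT(*) FROM single_customers "
        f"GROUP BY {column}, cust_left ORDER BY {column}, cust_left"
    )
    return conn.execute(sql).fetchall()


gender_split = split_counts("gender")
children_split = split_counts("number_children")

print(gender_split)    # the departed customers are spread over two values
print(children_split)  # the departed customers fall under a single value
```

Gender spreads the three departed customers over two values, while Number_Children places all of them under one value, which is the comparison the text draws.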
Decision Tree for Single Branch
Figure 4-3 shows the results of the preceding query for the example business
opportunity.
For Single, when Cust_Left is equal to Y, the number of records is 3 for
Number_Children equal to 0. Number_Children best discriminates the goal when
Marital Status is equal to Single.
Conditions Defining the Decision Tree
The model developed so far seems to characterize the customers that have left; that
is, the model finds the rows where Cust_Left is equal to Y. The model is now defined by
two conditions:
(marital_status = 'Divorced' AND gender = 'M')
(marital_status = 'Single' AND number_children = 0)
For Divorced and Male, the number of records is 5 for Cust_Left equal to Y, and the
number of records is 0 for Cust_Left equal to N. For Single and Number_Children
equal to 0, the number of records is 3 for Cust_Left equal to Y, and the number of
records is 0 for Cust_Left equal to N.
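As a rough check of these counts, the two conditions can be applied as a single filter. The sketch below is an assumption-laden illustration: it uses Python's built-in sqlite3 module rather than SQL/MX, the table name customers is hypothetical, the two conditions are combined with OR (the text lists them without saying how they are joined), and the rows are reconstructed from the counts in Figure 4-3. Attribute values that the text does not report are stored as NULL.

```python
import sqlite3

# Rows are (marital_status, gender, number_children, cust_left).
# Counts per branch follow Figure 4-3; None marks values the text does not give.
rows = (
    [("Single", "F", 0, "Y")] * 2
    + [("Single", "M", 0, "Y")]
    + [("Divorced", "M", None, "Y")] * 5
    # The one Divorced customer who stayed is not male (0 records for
    # Divorced and Male with Cust_Left = N); 'F' is inferred from that.
    + [("Divorced", "F", None, "N")]
    + [("Married", None, None, "N")] * 2
    + [("Married", None, None, "Y")]
    + [("Widow", None, None, "N"), ("Widow", None, None, "Y")]
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (marital_status TEXT, gender TEXT, "
    "number_children INTEGER, cust_left TEXT)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?, ?, ?)", rows)

# The two conditions from the text, joined with OR (an assumption).
MODEL = (
    "(marital_status = 'Divorced' AND gender = 'M') "
    "OR (marital_status = 'Single' AND number_children = 0)"
)

# Records selected by the model, grouped by the goal attribute.
covered = dict(
    conn.execute(
        f"SELECT cust_left, COUNT(*) FROM customers WHERE {MODEL} "
        "GROUP BY cust_left"
    ).fetchall()
)

# All customers who left, for comparison with the model's coverage.
total_left = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE cust_left = 'Y'"
).fetchone()[0]

print(covered, total_left)
```

Under these assumptions the model selects only records with Cust_Left equal to Y (5 from the Divorced and Male condition plus 3 from the Single condition) and none with Cust_Left equal to N. The Married and Widow customers who left are not covered by either condition.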
Figure 4-3. Decision Tree for Single Branch

Marital Status (number of records by Cust_Left: No / Yes)
   Single        No 0   Yes 3
      Chldrn=0   No 0   Yes 3
      Chldrn>0   No 0   Yes 0
   Married       No 2   Yes 1
   Widow         No 1   Yes 1
   Divorced      No 1   Yes 5