SQL/MX Data Mining Guide

Mining the Data
HP NonStop SQL/MX Data Mining Guide 523737-001
Building Decision Trees
GENDER                M     ?     Y          5
NUMBER_CHILDREN       ?     0     N          1
NUMBER_CHILDREN       ?     0     Y          1
NUMBER_CHILDREN       ?     1     Y          1
NUMBER_CHILDREN       ?     2     Y          3
--- 6 row(s) selected.
The preceding query shows that Gender is Male in every case where Cust_Left is equal
to Y, so Gender is a good predictor when Marital Status is Divorced. Number_Children
takes the values 0, 1, and 2 where Cust_Left is equal to Y, so it does not separate
the customers who left from those who stayed and is not a good predictor.
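One way to make this comparison concrete is to count, for each attribute, how many Divorced records would be misclassified if every attribute value predicted its majority Cust_Left outcome. The following Python sketch does this with the counts from the cross table. The Gender = F count is taken from Figure 4-2 because that row does not appear in the listing above. The majority-vote error measure and the name `misclassified` are illustrative assumptions, not part of SQL/MX or a method prescribed by this guide.

```python
from collections import defaultdict

# Cross-table counts for Marital Status = Divorced:
# (attribute, value, cust_left) -> number of records.
# The ("GENDER", "F", "N") count comes from Figure 4-2.
CROSS_TABLE = {
    ("GENDER", "F", "N"): 1,
    ("GENDER", "M", "Y"): 5,
    ("NUMBER_CHILDREN", 0, "N"): 1,
    ("NUMBER_CHILDREN", 0, "Y"): 1,
    ("NUMBER_CHILDREN", 1, "Y"): 1,
    ("NUMBER_CHILDREN", 2, "Y"): 3,
}


def misclassified(cross_table):
    """Return {attribute: records misclassified by a majority vote per value}.

    Hypothetical helper: an assumed scoring rule, not the guide's own method.
    """
    # Collect the goal counts for each (attribute, value) pair.
    by_value = defaultdict(dict)
    for (attribute, value, goal), count in cross_table.items():
        by_value[(attribute, value)][goal] = count

    errors = defaultdict(int)
    for (attribute, _value), goal_counts in by_value.items():
        # Every record outside the majority outcome counts as an error.
        errors[attribute] += sum(goal_counts.values()) - max(goal_counts.values())
    return dict(errors)


if __name__ == "__main__":
    for attribute, wrong in sorted(misclassified(CROSS_TABLE).items()):
        print(attribute, wrong)
```

Under this measure Gender misclassifies none of the six Divorced records, while Number_Children misclassifies one, which is no better than always predicting Cust_Left equal to Y.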
Decision Tree for Divorced Branch
Figure 4-2 shows the results of the preceding query for the example business
opportunity.
For Divorced, all 5 records with Cust_Left equal to Y have Gender equal to Male, and
the single record with Cust_Left equal to N has Gender equal to Female. Gender
therefore best discriminates the goal when Marital Status is equal to Divorced.
Computing Cross Tables When Marital Status Equal to Single
This query generates cross tables for all attributes, except Marital Status, compared to
the goal when Marital Status is equal to Single:
SELECT Independent_Variable, IV1, IV2, cust_left, COUNT(*)
FROM miningview
WHERE marital_status = 'Single'
TRANSPOSE ('GENDER', gender, NULL),
          ('NUMBER_CHILDREN', NULL, number_children)
          AS (Independent_Variable, IV1, IV2)
GROUP BY Independent_Variable, IV1, IV2, cust_left
ORDER BY Independent_Variable, IV1, IV2, cust_left;
INDEPENDENT_VARIABLE IV1 IV2 CUST_LEFT (EXPR)
Figure 4-2. Decision Tree for Divorced Branch
Marital Status
  Single:    No 0, Yes 3
  Married:   No 2, Yes 1
  Widow:     No 1, Yes 1
  Divorced:  No 1, Yes 5
    Male:    No 0, Yes 5
    Female:  No 1, Yes 0
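To see what the TRANSPOSE query for the Single branch computes, the following Python sketch mimics it outside SQL/MX. Each input row that passes the WHERE clause is expanded into one row per transpose item (one for GENDER, one for NUMBER_CHILDREN), and the expanded rows are then grouped and counted. The sample rows and the name `cross_tables` are hypothetical, since the guide does not list individual records. Only the shape of the query comes from the guide.

```python
from collections import Counter

# Hypothetical miningview rows, invented for illustration only.
ROWS = [
    {"marital_status": "Single", "gender": "M", "number_children": 0, "cust_left": "Y"},
    {"marital_status": "Single", "gender": "F", "number_children": 0, "cust_left": "Y"},
    {"marital_status": "Single", "gender": "M", "number_children": 1, "cust_left": "Y"},
    {"marital_status": "Married", "gender": "F", "number_children": 2, "cust_left": "N"},
]


def cross_tables(rows, marital_status):
    """Mimic WHERE + TRANSPOSE + GROUP BY with COUNT(*) from the query.

    Keys are (Independent_Variable, IV1, IV2, cust_left); None stands for NULL.
    The ORDER BY step is omitted in this sketch.
    """
    counts = Counter()
    for row in rows:
        if row["marital_status"] != marital_status:  # WHERE clause
            continue
        # TRANSPOSE: emit one output row per transpose item for this input row.
        counts[("GENDER", row["gender"], None, row["cust_left"])] += 1
        counts[("NUMBER_CHILDREN", None, row["number_children"], row["cust_left"])] += 1
    return counts


if __name__ == "__main__":
    for key, count in cross_tables(ROWS, "Single").items():
        print(key, count)
```

Because each qualifying input row contributes one row per transpose item, the counts for any single attribute add up to the number of qualifying rows (three in this made-up sample).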