Now, re-run the model and we will move on to…
EVALUATION
Figure 10-12. Tree resulting from a gini_index algorithm.
We see in this tree that there is much more detail and granularity when we use the Gini algorithm as
the criterion for our decision tree. We could further modify the tree by going back to design
view and changing the minimum number of items needed to form a node (size for split) or the minimum
size for a leaf. Even accepting the defaults for those parameters, though, we can see that the Gini
algorithm alone is much more sensitive than the Gain Ratio algorithm in identifying nodes and
leaves. Take a minute to explore this new tree model. You will find that it is extensive,
and that you will need to use both the Zoom and Mode tools to see it all. You should find that most of
our other independent variables (predictor attributes) are now being used, and that the granularity with
which Richard can identify each customer's likely adoption category is much greater. How active
the person is on Richard's employer's web site is still the single best predictor, but gender and
multiple levels of age have now also come into play. You will also find that a single attribute is
sometimes used more than once in a single branch of the tree. Decision trees are a lot of fun to
experiment with, and with a sensitive algorithm like Gini generating them, they can be
tremendously interesting as well.
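In this chapter these parameters are set through RapidMiner's Parameters panel, but the same ideas can be tried in code. Below is a minimal sketch in Python using scikit-learn, not the chapter's RapidMiner process: DecisionTreeClassifier takes a criterion parameter analogous to gini_index, along with min_samples_split and min_samples_leaf, which correspond roughly to the minimum size for split and minimum leaf size discussed above. The attribute names and the handful of records are invented here purely for illustration.

# A minimal sketch, assuming scikit-learn is installed; it mirrors the
# chapter's RapidMiner parameters, it is not the book's actual process.
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical stand-ins for the chapter's predictor attributes:
# web site activity score, gender (encoded 0/1), and age.
X = [
    [9, 0, 25], [7, 1, 34], [2, 0, 51], [1, 1, 47],
    [8, 1, 29], [3, 0, 62], [6, 0, 41], [2, 1, 58],
]
# Hypothetical adoption-category labels for those records.
y = ["Innovator", "Innovator", "Late Majority", "Laggard",
     "Early Adopter", "Laggard", "Early Majority", "Late Majority"]

tree = DecisionTreeClassifier(
    criterion="gini",     # analogous to the gini_index criterion above
    min_samples_split=4,  # roughly the "minimum size for split"
    min_samples_leaf=2,   # roughly the "minimum leaf size"
)
tree.fit(X, y)

# Print the induced tree as indented text rather than a graphical view.
print(export_text(tree, feature_names=["web_activity", "gender", "age"]))

Gini impurity is 1 minus the sum of the squared class proportions in a node, so the algorithm keeps splitting as long as a split measurably reduces impurity; that is one way to think about the extra granularity visible in Figure 10-12 compared to the Gain Ratio tree.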
 