Interpret and discuss the results of a cross-validation matrix.
CROSS-VALIDATION
Cross-validation is the process of checking for the likelihood of false positives in predictive
models. RapidMiner, like most data mining software products, has operators for cross-validation
and for other forms of false positive detection. This text uses 'false positive' loosely to
mean any value that is predicted incorrectly. We will give one example here, using the decision
tree we built for our hypothetical client Richard back in Chapter 10. Complete the following steps:
1) Open RapidMiner and start a new, blank data mining process.
2) Go to the Repositories tab and locate the Chapter 10 training data set. This is the one
that has attributes describing people's buying habits on Richard's employer's web site, along
with their category of eReader adoption. Drag this data set into your main process
window. You can rename it if you would like. In Figure 13-1, we have renamed it eReader
Train.
Figure 13-1. Adding the Chapter 10 training data to a new model in order to
cross-validate its predictive capabilities.
3) Add a Set Role operator to the stream. We'll learn a new trick here with this operator. Set
the User_ID attribute to be 'id'. We know we still need to set eReader_Adoption to be
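
As background for the cross-validation idea introduced at the start of this section, here is a minimal sketch of what a k-fold check does, written in plain Python rather than RapidMiner. It is an illustration under stated assumptions, not RapidMiner's implementation: the labels are made up (they are not the Chapter 10 data), the folds are simple contiguous slices, and a "predict the most common label" rule stands in for the decision tree. The function names are hypothetical.

```python
from collections import Counter

def k_fold_indices(n_rows, k):
    """Split row indices 0..n_rows-1 into k nearly equal, contiguous folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def majority_label(labels):
    """Stand-in 'model': always predict the most common training label."""
    return Counter(labels).most_common(1)[0][0]

def cross_validate(labels, k):
    """Return a Counter mapping (actual, predicted) pairs to counts."""
    confusion = Counter()
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        # Train on every row that is NOT in the held-out fold.
        train = [lab for i, lab in enumerate(labels) if i not in held_out]
        predicted = majority_label(train)
        # Score the held-out rows against their known, actual labels.
        for i in test_idx:
            confusion[(labels[i], predicted)] += 1
    return confusion

# Hypothetical adoption categories, invented for this sketch.
labels = ["Early Adopter", "Late Majority", "Early Adopter", "Innovator",
          "Early Adopter", "Late Majority", "Early Adopter", "Late Majority",
          "Early Adopter", "Early Adopter"]

confusion = cross_validate(labels, k=5)
correct = sum(n for (actual, pred), n in confusion.items() if actual == pred)
accuracy = correct / len(labels)

for (actual, pred), n in sorted(confusion.items()):
    print(f"actual={actual!r:16} predicted={pred!r:16} count={n}")
print(f"accuracy = {accuracy:.2f}")
```

Each row is held out exactly once, so the (actual, predicted) counts add up to the number of rows. Pairs where the two labels differ are the incorrect predictions that a cross-validation matrix is meant to expose.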
 