Logistic Regression - Data Mining for the Masses


Figure 9-2. The training and scoring data sets in a new main process window in RapidMiner.

5) Run the model and compare the ranges for all attributes between the scoring and training result set tabs (Figures 9-3 and 9-4, respectively). You should find that the ranges are the same. As was the case with Linear Regression, the scoring values must all fall within the lower and upper bounds set by the corresponding values in the training data set. We can see in Figures 9-3 and 9-4 that this is the case, so our data are very clean. They were prepared during extraction from Sonia's source database, and we will not need to do further data preparation to filter out observations with inconsistent values or to modify missing values.
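For readers who want to repeat this range check outside of RapidMiner, the comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (attribute_ranges, out_of_range), the attribute names Age and Cholesterol, and all of the values are hypothetical and are not taken from Sonia's data. Only the label name 2nd_Heart_Attack comes from the text.

```python
def attribute_ranges(rows):
    """Return {attribute: (min, max)} for a list of dict rows, skipping missing values."""
    ranges = {}
    for row in rows:
        for name, value in row.items():
            if value is None:
                continue
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))
    return ranges


def out_of_range(training_rows, scoring_rows, label=None):
    """Report scoring attributes whose range falls outside the training range.

    Returns {attribute: (scoring_range, training_range)}; an empty dict means
    every scoring value lies within the training bounds.
    """
    train = attribute_ranges(training_rows)
    score = attribute_ranges(scoring_rows)
    problems = {}
    for name, (low, high) in score.items():
        if name == label:
            continue  # the label is not expected in scoring data
        if name not in train:
            problems[name] = ((low, high), None)  # attribute unseen in training
            continue
        train_low, train_high = train[name]
        if low < train_low or high > train_high:
            problems[name] = ((low, high), (train_low, train_high))
    return problems


# Hypothetical, made-up observations used only to exercise the functions.
training = [
    {"Age": 42, "Cholesterol": 160, "2nd_Heart_Attack": "No"},
    {"Age": 81, "Cholesterol": 239, "2nd_Heart_Attack": "Yes"},
]
scoring = [
    {"Age": 55, "Cholesterol": 180},
    {"Age": 70, "Cholesterol": 230},
]
scoring_with_outlier = scoring + [{"Age": 90, "Cholesterol": 200}]

clean_report = out_of_range(training, scoring, label="2nd_Heart_Attack")
outlier_report = out_of_range(training, scoring_with_outlier, label="2nd_Heart_Attack")
print(clean_report)
print(outlier_report)
```

An empty report corresponds to the situation in this step, where no further filtering is needed. A non-empty report would point to the attributes whose scoring observations should be filtered out before the model is applied.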

Figure 9-3. Meta data for the scoring data set (note absence of 2nd_Heart_Attack attribute).
