Model performance
Most measures of model performance may be computed using any of the three
applicable datasets - training, validation, and test. These sets can also be used to
compare actual outputs to predicted outputs.
For the test set only, the performance measures are not automatically computed.
The dataset must first be applied to the model (drag and drop the test dataset
onto the model, then choose “Test model performance”).
Classification
1. Classification error rate
a. Compare the model's error rate to the baseline error rate, which is one minus
the rate of the most frequently occurring class. For example, if the most
frequently occurring class is found in 52% of the observations, then a
model prediction error rate of 40% would be an improvement over the
baseline error rate of 48%. However, if the rate of the most frequently
occurring class is 95%, then a model error rate of 10% would be worse
than the baseline error rate of 5%.
2. View model error rates using the confusion viewer, the ROC viewer, and
the class model viewer.
3. False positive and false negative error rates - available in the confusion
viewer. Depending on the intended model application, the costs of the
different types of errors may be quite different. If one error type is more
costly than another, focus on that type of error.
4. Area under curve (AUC) - available in ROC curve viewer. Maximum
AUC is 1.0. The closer to 1.0 the better.
5. Model lift - available within ROC curve viewer. Represents error rate
found when only the top n% of the observations are chosen.
6. Model application costs - available within ROC curve viewer. Allows the
user to assign monetary costs to false positive and false negative errors
in order to compute the benefits of the model.
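The baseline comparison, the two error types, and the cost weighting described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool's own computation: the function names, the choice of a single "positive" class, and the unit costs `cost_fp` and `cost_fn` are all assumptions made for the example, and the viewers may define the false positive and false negative rates differently.

```python
from collections import Counter


def baseline_error_rate(actual):
    """One minus the share of the most frequently occurring class."""
    most_common_count = Counter(actual).most_common(1)[0][1]
    return 1.0 - most_common_count / len(actual)


def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def error_summary(actual, predicted, positive, cost_fp=1.0, cost_fn=1.0):
    """Overall, false positive, and false negative rates plus total cost."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    return {
        "error_rate": (fp + fn) / len(actual),
        # Share of actual negatives that were predicted positive.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Share of actual positives that were predicted negative.
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        # Hypothetical monetary cost: each error type has its own unit cost.
        "total_cost": fp * cost_fp + fn * cost_fn,
    }


# Toy data, invented for illustration.
actual = [1, 1, 0, 0, 0]
predicted = [1, 0, 1, 0, 0]
summary = error_summary(actual, predicted, positive=1, cost_fp=10.0, cost_fn=50.0)
baseline = baseline_error_rate(actual)
print(summary, baseline)
```

A model is worth considering only when its `error_rate` falls below `baseline`; when one error type is costlier, comparing `total_cost` across candidate models may be more informative than the raw error rate.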
Regression
1. R² - measure of regression fitness. Any value greater than zero is an
improvement on the baseline model (output attribute mean). Available
in the regression model viewer and the regression summary where
applicable.
2. F-statistic and P-value - statistical measures of goodness-of-fit with
respect to the regression as a whole and to input coefficients. Available
for linear and polynomial regressions only via regression summary.
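To see why a value above zero beats the mean baseline, R² can be computed by hand as one minus the ratio of the model's squared error to the squared error of always predicting the output mean. The sketch below uses the standard definition and a hypothetical function name `r_squared`; it is not necessarily how the regression model viewer computes the figure.

```python
def r_squared(actual, predicted):
    """R² = 1 - SS_res / SS_tot, where SS_tot is the error of the mean baseline."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R² is undefined when the output attribute is constant")
    return 1.0 - ss_res / ss_tot


# Toy data, invented for illustration.
actual = [1.0, 2.0, 3.0, 4.0]
mean_baseline = [2.5, 2.5, 2.5, 2.5]   # always predict the output mean
model_output = [1.5, 2.0, 3.0, 3.5]

print(r_squared(actual, mean_baseline))  # the baseline itself scores zero
print(r_squared(actual, model_output))   # a better model scores above zero
```

Predicting the mean gives exactly zero, a perfect fit gives one, and a model that does worse than the mean gives a negative value.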