plot. By dragging the green slider to the left or right, the user can adjust the
probability cut-off at which a prediction is triggered.
Drag the cut-off slider for the confusion matrix of the tree classifier down
to 30%.
As the cut-off for Prob(N) is dragged to the left, the false positives (wasted
promotions) begin to drop while the false negatives (missed sales) increase. At
this point, you might ask, “At what level should the cut-off be set to eliminate all
false positives?” Or you might ask, “At what level should the cut-off be set to
eliminate all false negatives?”
Drag the cut-off slider to determine if and at what level the false positives
disappear and at what level the false negatives disappear.
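The same experiment can be reproduced outside the tool. Below is a minimal sketch in Python (using NumPy), assuming we have an array of true labels and an array of predicted positive-class probabilities; the variable names and toy values are illustrative, not the book's car-buyer data. Thresholding the positive-class probability is equivalent to sliding the Prob(N) cut-off in the opposite direction.

```python
# Minimal sketch: confusion-matrix counts as the cut-off slides.
# The labels and probabilities below are toy values, not the book's data.
import numpy as np

y_true = np.array([1, 1, 1, 0, 0, 1, 0, 0, 1, 0])   # 1 = buyer, 0 = non-buyer
y_prob = np.array([0.92, 0.80, 0.55, 0.48, 0.35,
                   0.30, 0.28, 0.15, 0.10, 0.05])   # predicted Prob(buyer)

def confusion_counts(y_true, y_prob, cutoff):
    """Count TP, FP, TN, FN when predicting positive iff prob >= cutoff."""
    y_pred = (y_prob >= cutoff).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    return tp, fp, tn, fn

# Sweep the cut-off the way the slider does: as it rises, false positives
# (wasted promotions) fall while false negatives (missed sales) grow, and
# the printout shows where each count first reaches zero.
for cutoff in np.arange(0.05, 1.0, 0.05):
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, cutoff)
    print(f"cutoff={cutoff:.2f}  TP={tp}  FP={fp}  TN={tn}  FN={fn}")
```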
Interpreting the ROC Curve
To dig deeper into the performance of the ANN car buyer classifier:
Drag the model up to the display currently containing the confusion matrix
and release on the right side, moving the confusion matrix to the left.
Select “ROC Curve” (Figure 5.11).
The ROC (receiver operating characteristic) curve represents the trade-offs
between the undesirable false positive rate (FPR) and the desirable true
positive rate (TPR). That is, how much of an increase in the FPR must be
accepted in order to achieve a desired increase in the TPR?
The ROC relies on the fact that the confidence we have in predictions varies
from observation to observation. Suppose that, instead of predicting a positive
result whenever the probability of the positive value exceeds the probability of
the negative value, we decide to predict positive only if that probability is 0.90
or greater. We would expect the FPR to be low, because a negative observation is
misclassified only when the classifier assigns it a probability of 0.90 or more.
But what would the TPR be? If 75% of the positive observations have a positive
probability of 0.90 or greater, then the TPR will be high. However, if our
classifier is less certain and only 25% of the positive observations have a
probability of 0.90 or greater, then the TPR will be much lower.
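To make the thought experiment concrete, the sketch below (with hypothetical score arrays, not the book's data) computes the TPR and FPR at a 0.90 threshold for a confident classifier and a less certain one; it reproduces the 75% versus 25% contrast described above.

```python
# Sketch of the 0.90-threshold experiment with hypothetical scores.
import numpy as np

def rates_at_threshold(y_true, y_prob, threshold):
    """TPR = fraction of actual positives predicted positive;
    FPR = fraction of actual negatives predicted positive."""
    y_pred = y_prob >= threshold
    tpr = np.sum(y_pred & (y_true == 1)) / np.sum(y_true == 1)
    fpr = np.sum(y_pred & (y_true == 0)) / np.sum(y_true == 0)
    return tpr, fpr

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
# A confident classifier pushes most positives above 0.90 ...
confident = np.array([0.97, 0.95, 0.92, 0.60, 0.40, 0.20, 0.10, 0.05])
# ... a less certain one leaves most positives below it.
uncertain = np.array([0.93, 0.70, 0.65, 0.55, 0.45, 0.40, 0.30, 0.20])

for name, probs in [("confident", confident), ("uncertain", uncertain)]:
    tpr, fpr = rates_at_threshold(y_true, probs, 0.90)
    print(f"{name}: TPR={tpr:.2f}  FPR={fpr:.2f}")
# confident: TPR=0.75  FPR=0.00
# uncertain: TPR=0.25  FPR=0.00
```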
The ROC visually represents how confident the classifier is in its predictions.
The ROC of an ideal classifier would go straight up the left axis to the top, then
horizontally across the top to the right, indicating that it can achieve a TPR of
1.0 without including any false positives (FPR = 0).
The closer an ROC curve is to the ideal, the more confidence we have in the
classifier's predictions. A common measure of closeness to the ideal is the
area under the curve (AUC).
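If scikit-learn is available, tracing the curve and computing the AUC takes only a few lines; this is a generic sketch with toy labels and scores, not the ANN car buyer classifier itself.

```python
# Sketch: trace the ROC and compute the area under it with scikit-learn.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])            # toy labels
y_prob = np.array([0.97, 0.95, 0.92, 0.60,
                   0.40, 0.20, 0.10, 0.05])             # toy scores

fpr, tpr, thresholds = roc_curve(y_true, y_prob)
auc = roc_auc_score(y_true, y_prob)

# An ideal classifier reaches TPR = 1.0 at FPR = 0 (AUC = 1.0);
# random guessing traces the diagonal (AUC = 0.5).
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
print(f"AUC = {auc:.2f}")
```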