Database Reference
In-Depth Information
Then the following R code indicates that the corresponding TPR and FPR values
are 0.91 and 0.29, respectively. Thus, 91% of the customers who will churn are
properly identified, but at a cost of misclassifying 29% of the customers who will
remain.
i <- which(round(alpha,2) == .15)
paste("Threshold=" , (alpha[i]) , " TPR=" , tpr[i] , "
FPR=" , fpr[i])
[1] "Threshold= 0.1543 TPR= 0.9116 FPR= 0.2869"
[2] "Threshold= 0.1518 TPR= 0.9122 FPR= 0.2875"
[3] "Threshold= 0.1479 TPR= 0.9145 FPR= 0.2942"
[4] "Threshold= 0.1455 TPR= 0.9174 FPR= 0.2981"
The ROC curve is useful for evaluating other classifiers and will be utilized again in
Chapter 7, “Advanced Analytical Theory and Methods: Classification.”
Histogram of the Probabilities
It can be useful to visualize the observed responses against the estimated
probabilities provided by the logistic regression. Figure 6.17 provides overlaying
histograms for the customers who churned and for the customers who remained
as customers. With a proper fitting logistic model, the customers who remained
tend to have a low probability of churning. Conversely, the customers who churned
have a high probability of churning again. This histogram plot helps visualize the
number of items to be properly classified or misclassified. In the Churn example,
an ideal histogram plot would have the remaining customers grouped at the left
side of the plot, the customers who churned at the right side of the plot, and no
overlap of these two groups.
Search WWH ::




Custom Search