In general, this log-likelihood ratio test is particularly useful in forward and
backward stepwise methods, which add variables to or remove them from the
proposed logistic regression model.
Receiver Operating Characteristic (ROC) Curve
Logistic regression is often used as a classifier to assign class labels to a person,
item, or transaction based on the predicted probability provided by the model. In
the Churn example, a customer can be classified with the label called Churn if the
logistic model predicts a high probability that the customer will churn. Otherwise,
a Remain label is assigned to the customer. Commonly, 0.5 is used as the default
probability threshold to distinguish between any two class labels. However, any
threshold value can be used depending on the preference to avoid false positives
(for example, to predict Churn when the customer will actually Remain) or false
negatives (for example, to predict Remain when the customer will actually Churn).
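The effect of the threshold can be illustrated with a minimal sketch. The probabilities below are made-up values, not output from the Churn model, and the helper name classify is hypothetical; treating a probability exactly equal to the threshold as Remain is also an assumption of this sketch.

```r
# Hypothetical predicted churn probabilities for five customers
# (illustrative values only, not taken from the Churn model)
prob <- c(0.91, 0.48, 0.15, 0.62, 0.50)

# Assign a label by comparing each probability with a threshold.
# Assumption: a probability exactly at the threshold is labeled Remain.
classify <- function(prob, threshold = 0.5) {
  ifelse(prob > threshold, "Churn", "Remain")
}

labels_default <- classify(prob)        # default threshold of 0.5
labels_strict  <- classify(prob, 0.7)   # higher threshold, fewer Churn labels

print(labels_default)
print(labels_strict)
```

Raising the threshold makes the classifier more reluctant to predict Churn, which tends to trade false positives for false negatives.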
In general, for two class labels, C and ¬C, where “¬C” denotes “not C,” some
working definitions and formulas follow:
• True Positive: predict C, when actually C
• True Negative: predict ¬C, when actually ¬C
• False Positive: predict C, when actually ¬C
• False Negative: predict ¬C, when actually C
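$$\text{False Positive Rate (FPR)} = \frac{\text{\# of false positives}}{\text{\# of negatives}} = \frac{FP}{FP + TN} \tag{6.16}$$

$$\text{True Positive Rate (TPR)} = \frac{\text{\# of true positives}}{\text{\# of positives}} = \frac{TP}{TP + FN} \tag{6.17}$$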
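As a rough illustration of these definitions, the following sketch counts the four outcomes for a small set of made-up labels and then computes the two rates, taking FPR as false positives divided by all actual negatives and TPR as true positives divided by all actual positives. The vectors actual and predicted are hypothetical and are not drawn from the Churn dataset; Churn plays the role of class C.

```r
# Hypothetical actual and predicted labels for eight customers
# (illustrative values only)
actual    <- c("Churn", "Churn", "Remain", "Remain",
               "Remain", "Churn", "Remain", "Remain")
predicted <- c("Churn", "Remain", "Remain", "Churn",
               "Remain", "Churn", "Remain", "Remain")

# Count each outcome, treating Churn as the positive class C
tp <- sum(predicted == "Churn"  & actual == "Churn")   # true positives
fp <- sum(predicted == "Churn"  & actual == "Remain")  # false positives
tn <- sum(predicted == "Remain" & actual == "Remain")  # true negatives
fn <- sum(predicted == "Remain" & actual == "Churn")   # false negatives

# Rates: positives are tp + fn, negatives are fp + tn
tpr <- tp / (tp + fn)
fpr <- fp / (fp + tn)

print(c(TP = tp, FP = fp, TN = tn, FN = fn))
print(c(TPR = tpr, FPR = fpr))
```

Each choice of probability threshold yields one such (FPR, TPR) pair.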
The plot of the True Positive Rate (TPR) against the False Positive Rate (FPR)
is known as the Receiver Operating Characteristic (ROC) curve. Using the ROCR
package, the following R commands generate the ROC curve for the Churn
example:
library(ROCR)
pred = predict(Churn_logistic3, type="response")
predObj = prediction(pred, churn_input$Churned)
rocObj = performance(predObj, measure="tpr",