Using the JDM API - Java Data Mining: Strategy, Standard, and Practice


9.4.4 Test Metrics for Model Evaluation

This section introduces the interfaces and methods used to compute and retrieve classification test metrics using JDM. Table 9-14 lists the interfaces related to classification test metrics. JDM provides two types of tasks for computing test metrics for supervised functions: supervised.TestTask and supervised.TestMetricsTask. The TestTask requires a supervised model and test data, whereas the TestMetricsTask interface uses apply output data that includes actual and predicted target values. In this example, we use ClassificationTestTask; the use of the TestMetricsTask interface is illustrated in the regression example in Section 9.5.
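To make the idea of apply output data concrete, the following sketch computes a confusion matrix, accuracy, and error from pairs of actual and predicted target values. It is plain Java and does not use the JDM API: the class ConfusionMatrixSketch, its methods, and the sample target values "Attriter" and "Non-attriter" are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; NOT part of javax.datamining.
class ConfusionMatrixSketch {
    // counts.get(actual).get(predicted) = number of cases with that pair
    private final Map<String, Map<String, Integer>> counts = new LinkedHashMap<>();
    private int total = 0;
    private int correct = 0;

    // Record one test case: its actual and its predicted target value.
    void add(String actual, String predicted) {
        counts.computeIfAbsent(actual, k -> new LinkedHashMap<>())
              .merge(predicted, 1, Integer::sum);
        total++;
        if (actual.equals(predicted)) {
            correct++;
        }
    }

    // Number of cases with the given actual/predicted combination.
    int count(String actual, String predicted) {
        Map<String, Integer> row = counts.get(actual);
        if (row == null) {
            return 0;
        }
        Integer n = row.get(predicted);
        return n == null ? 0 : n;
    }

    // Fraction of cases predicted correctly.
    double accuracy() {
        return total == 0 ? 0.0 : (double) correct / total;
    }

    // Fraction of cases predicted incorrectly.
    double error() {
        return total == 0 ? 0.0 : (double) (total - correct) / total;
    }

    public static void main(String[] args) {
        // Sample values are made up for this sketch.
        String[] actual    = {"Attriter", "Attriter", "Non-attriter", "Non-attriter", "Non-attriter"};
        String[] predicted = {"Attriter", "Non-attriter", "Non-attriter", "Non-attriter", "Attriter"};
        ConfusionMatrixSketch cm = new ConfusionMatrixSketch();
        for (int i = 0; i < actual.length; i++) {
            cm.add(actual[i], predicted[i]);
        }
        System.out.println("accuracy = " + cm.accuracy());
        System.out.println("error    = " + cm.error());
    }
}
```

The diagonal cells of the matrix (actual equals predicted) are the correct predictions; every other cell is a specific kind of error, which is why accuracy and error can both be derived from the same counts.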

Listing 9-11 shows the code that extends the CustomerAttrition class with the attrition_model evaluation. Recall that Section 7.1.6 illustrates classification test metrics such as accuracy, error, confusion matrix, lift, and receiver operating characteristics (ROC). Listing 9-11 shows the computation and retrieval of the classification test metrics. Lines 15 to 22 show the creation and execution of the attritionTestTask that computes the test metrics of the attrition_model using the CUSTOMERS_TEST_DATA. Lines 27 to 29 show the retrieval of the attrition_test_metrics object that was created by the test

Table 9-14 Classification test metrics-related interfaces

javax.datamining.supervised.classification package

ClassificationTestTask
A ClassificationTestTask is used for testing a classification model to measure the model quality on test data.

ClassificationTestMetrics
A ClassificationTestMetrics encapsulates classification test metrics such as confusion matrix, lift, and ROC. It provides get methods to retrieve these metrics.

ConfusionMatrix
A ConfusionMatrix specifies the statistics of the correct predictions and errors.

Lift
A Lift specifies the results of the lift computation. It contains the lift and target density details for each quantile. Using this object, one can plot the lift charts that are described in Chapter 7.

ReceiverOperatingCharacteristics
A ReceiverOperatingCharacteristics specifies the result of the receiver operating characteristics computation. It contains the false and true positive rates at various probability thresholds. Using this object, one can plot the ROC charts described in Chapter 7.

ClassificationTestMetricsTask
A ClassificationTestMetricsTask is a mining task used to compute test metrics given apply output data.
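As a rough illustration of what the Lift and ReceiverOperatingCharacteristics objects in Table 9-14 hold, the sketch below computes false and true positive rates at a probability threshold and the lift of each quantile. It is plain Java and does not call the JDM API. The class RocLiftSketch, its method names, the sample probabilities, and the choice of non-cumulative lift (quantile target density divided by overall target density) are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper for illustration; NOT part of javax.datamining.
class RocLiftSketch {

    // Returns {falsePositiveRate, truePositiveRate} when a case is predicted
    // positive if its positive-class probability is >= threshold.
    static double[] rocPoint(double[] prob, boolean[] positive, double threshold) {
        int tp = 0, fp = 0, pos = 0, neg = 0;
        for (int i = 0; i < prob.length; i++) {
            boolean predictedPositive = prob[i] >= threshold;
            if (positive[i]) {
                pos++;
                if (predictedPositive) tp++;
            } else {
                neg++;
                if (predictedPositive) fp++;
            }
        }
        double fpr = neg == 0 ? 0.0 : (double) fp / neg;
        double tpr = pos == 0 ? 0.0 : (double) tp / pos;
        return new double[] {fpr, tpr};
    }

    // Sorts cases by descending probability, splits them into quantiles, and
    // returns (target density in quantile) / (overall target density) for each.
    static double[] liftByQuantile(double[] prob, boolean[] positive, int quantiles) {
        int n = prob.length;
        if (n == 0 || quantiles <= 0) {
            return new double[0];
        }
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -prob[i]));

        int totalPos = 0;
        for (boolean p : positive) {
            if (p) totalPos++;
        }
        double overallDensity = (double) totalPos / n;

        double[] lift = new double[quantiles];
        for (int q = 0; q < quantiles; q++) {
            int start = q * n / quantiles;
            int end = (q + 1) * n / quantiles;
            int hits = 0;
            for (int k = start; k < end; k++) {
                if (positive[order[k]]) hits++;
            }
            double density = end > start ? (double) hits / (end - start) : 0.0;
            lift[q] = overallDensity == 0.0 ? 0.0 : density / overallDensity;
        }
        return lift;
    }

    public static void main(String[] args) {
        // Sample scores and outcomes are made up for this sketch.
        double[] prob = {0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1};
        boolean[] positive = {true, true, false, true, false, false, true, false};
        System.out.println("ROC point at 0.5: " + Arrays.toString(rocPoint(prob, positive, 0.5)));
        System.out.println("Lift by quartile: " + Arrays.toString(liftByQuantile(prob, positive, 4)));
    }
}
```

Evaluating rocPoint over a range of thresholds yields the points of an ROC chart, and plotting the liftByQuantile values against the quantile index yields a simple lift chart.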
