confidence (confidence associated with the model's prediction). Typically,
confidence values are represented either as a value between 0 and 1 or
as a percentage between 0 and 100, with 0 being the lowest confidence
and 1 (or 100 percent) being the highest. Unlike classification, which
can produce multiple target values and associated probabilities,
regression produces a single target value and an associated confidence
because the target is a continuous value.
Problem Definition: How to Find Important
ABCBank has collected hundreds of attributes of its customers, and
the user wants to understand which attributes most strongly affect
customer attrition. By ranking the attributes by importance, the user
can recommend that high-ranking attributes be cleaned more carefully.
The user may also select the top n attributes to include in model
building. This might not only reduce the time required for model
building and scoring, but also improve model accuracy.
Solution Approach: Rank Attributes According to
JDM defines an attribute importance function that measures the
predictive power of each attribute in predicting a target and produces
a list of attributes ranked by their relative importance. Using this
function, analysts can select the attributes that are important for
predicting attrition. As noted above, the attribute importance function
helps to automate the selection of attributes for predicting a target.
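
To make the idea of ranking by predictive power concrete, the following is a minimal sketch in plain Java. It does not use the JDM API, and JDM does not prescribe this measure: information gain is assumed here only as one simple stand-in for "predictive power". The class name AttributeRanking, its methods, the attribute names (hasComplaint, region, accountType), and the six toy records are all hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: ranks categorical attributes by information gain
// with respect to a categorical target. Not part of the JDM API.
class AttributeRanking {

    // Shannon entropy (in bits) of a list of class labels.
    static double entropy(List<String> labels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / labels.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Information gain: target entropy minus the weighted entropy of the
    // target within each group of records sharing an attribute value.
    static double informationGain(List<String> attributeValues, List<String> target) {
        Map<String, List<String>> partitions = new HashMap<>();
        for (int i = 0; i < target.size(); i++) {
            partitions.computeIfAbsent(attributeValues.get(i), k -> new ArrayList<>())
                      .add(target.get(i));
        }
        double conditional = 0.0;
        for (List<String> part : partitions.values()) {
            conditional += ((double) part.size() / target.size()) * entropy(part);
        }
        return entropy(target) - conditional;
    }

    // Attribute names sorted from most to least important.
    static List<String> rank(Map<String, List<String>> attributes, List<String> target) {
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, List<String>> e : attributes.entrySet()) {
            scores.put(e.getKey(), informationGain(e.getValue(), target));
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        ranked.sort(Comparator.comparingDouble((String name) -> scores.get(name)).reversed());
        return ranked;
    }

    // The first n entries of a ranked list (or all of them if fewer exist).
    static List<String> topN(List<String> ranked, int n) {
        return new ArrayList<>(ranked.subList(0, Math.min(n, ranked.size())));
    }

    public static void main(String[] args) {
        // Hypothetical toy data: six customers, target = attrition.
        List<String> attrition = Arrays.asList("yes", "yes", "yes", "no", "no", "no");
        Map<String, List<String>> attributes = new HashMap<>();
        attributes.put("hasComplaint", Arrays.asList("y", "y", "y", "n", "n", "n"));
        attributes.put("region", Arrays.asList("east", "west", "east", "west", "east", "west"));
        attributes.put("accountType", Arrays.asList("chk", "chk", "chk", "chk", "chk", "chk"));

        List<String> ranked = rank(attributes, attrition);
        System.out.println("Ranked attributes: " + ranked);
        System.out.println("Top 2 attributes:  " + topN(ranked, 2));
    }
}
```

On this toy data, hasComplaint separates the two attrition classes perfectly and ranks first, region carries only a little information and ranks second, and accountType is constant and ranks last. A real attribute importance function would also need to handle numerical attributes, missing values, and ties, which this sketch ignores.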
Data Specification, Fine-Tune Settings, and Algorithm
We use the same dataset as discussed in Section 7.1.3 for the classifi-
cation problem. The data specification for attribute importance is the
same as for classification.
JDM does not specify any algorithm settings for attribute impor-
tance. However, several algorithms can be used to support this