used extensively in industry. For example, credit card transactions
and mortgage applications are often approved with input from data
When a model is first produced, it can be quite reliable in terms of
the accuracy of its predictions on new data. However, is the predictive
quality of a model invariant? Does model accuracy remain constant
Few things remain constant, especially when humans are
involved. Tastes change, needs change, technology changes, life-
altering events force change. For example, a model that may have
been excellent at predicting credit risk for a given month may start to
show signs of degraded performance. When this happens we say
that such a model is stale . In this case, the model may need to be
rebuilt, taking into account more recent data. The data mining
process and its artifacts require periodic review and maintenance to
maintain reliable results.
How Can Data Mining Increase Profits and Reduce Costs?
Let's consider an example from campaign management, first without
the use of data mining and then using data mining. One of the objec-
tives for campaign management is to determine which customers to
contact with regard to a particular sales campaign, with goals to min-
imize costs and maximize response and profits. If you knew in
advance which customers would respond, you may likely contact
only those customers.
Consider Company DMWHIZZ with a base of a million customers.
Based on previous campaign responses, DMWHIZZ generally gets a
2 percent response rate. With a million customers, this produces
about 20,000 responses. A proposed DMWHIZZ campaign will
require mailing costs of $1.50 per item, with a total campaign cost of
$1.5 million. If the average profit per customer who responds is $50,
our expected total profit is $1 million (20,000
$50). But, since the
net profit of the campaign is a negative $500,000, DMWHIZZ will not
proceed with the campaign.
Let's see how applying data mining can make this campaign
profitable. Selecting those customers most likely to respond is a
classification problem (i.e., classify each customer as responding or
not with an associated probability). As with any classification prob-
lem, DMWHIZZ will need to have actual response data from a
similar campaign to learn customer behavior. To achieve this,