Overview of Data Mining - Java Data Mining: Strategy, Standard, and Practice

used extensively in industry. For example, credit card transactions

and mortgage applications are often approved with input from data

mining models.

When a model is first produced, it can be quite reliable in terms of

the accuracy of its predictions on new data. However, is the predictive

quality of a model invariant? Does model accuracy remain constant

over time?

Few things remain constant, especially when humans are

involved. Tastes change, needs change, technology changes, life-

altering events force change. For example, a model that may have

been excellent at predicting credit risk for a given month may start to

show signs of degraded performance. When this happens, we say

that such a model is stale. In this case, the model may need to be

rebuilt, taking into account more recent data. The data mining

process and its artifacts require periodic review and maintenance to

keep results reliable.
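
One simple way to make the notion of a stale model concrete is to compare the accuracy measured when the model was built with its accuracy on more recent data. The following is a minimal sketch only: the class name, the method names, and the tolerance threshold are hypothetical illustrations, not part of any data mining API, and a real deployment would pick the metric and threshold to suit the problem.

```java
// Illustrative sketch: all names here are hypothetical, not taken from any library.
final class ModelStalenessCheck {

    /** Fraction of cases where the predicted class matches the actual outcome. */
    static double accuracy(boolean[] predicted, boolean[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        int correct = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i] == actual[i]) {
                correct++;
            }
        }
        return (double) correct / predicted.length;
    }

    /**
     * Flags the model as stale when accuracy on recent data has dropped
     * below the build-time accuracy by more than the given tolerance.
     * The tolerance is an assumed, application-specific setting.
     */
    static boolean isStale(double baselineAccuracy, double recentAccuracy, double tolerance) {
        return baselineAccuracy - recentAccuracy > tolerance;
    }
}
```

Under these assumptions, a model built at 90 percent accuracy that now scores 80 percent on recent data would be flagged for rebuilding with a tolerance of 5 percentage points, while one scoring 88 percent would not.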

1.3.2

How Can Data Mining Increase Profits and Reduce Costs?

Let's consider an example from campaign management, first without

the use of data mining and then using data mining. One of the

objectives for campaign management is to determine which customers to

contact with regard to a particular sales campaign, with the goals of

minimizing costs and maximizing response and profits. If you knew in

advance which customers would respond, you would likely contact

only those customers.

Consider Company DMWHIZZ with a base of a million customers.

Based on previous campaign responses, DMWHIZZ generally gets a

2 percent response rate. With a million customers, this produces

about 20,000 responses. A proposed DMWHIZZ campaign will

require mailing costs of $1.50 per item, with a total campaign cost of

$1.5 million. If the average profit per customer who responds is $50,

our expected total profit is $1 million (20,000 × $50). But since the

net profit of the campaign is negative $500,000 ($1 million minus the

$1.5 million campaign cost), DMWHIZZ will not

proceed with the campaign.
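
The arithmetic above can be written out as a short Java sketch. The class and method names are hypothetical and chosen only for illustration; the figures are the ones given in the DMWHIZZ example. The sketch also derives the break-even response rate (mailing cost per item divided by profit per responder), which is not stated in the text but follows directly from the same numbers.

```java
// Illustrative sketch: class and method names are hypothetical.
final class CampaignEconomics {

    /** Expected number of responders when contacting the given number of customers. */
    static double expectedResponses(long customersContacted, double responseRate) {
        return customersContacted * responseRate;
    }

    /** Total mailing cost, assuming one item is mailed per contacted customer. */
    static double campaignCost(long customersContacted, double costPerItem) {
        return customersContacted * costPerItem;
    }

    /** Net profit: profit earned from responders minus the total mailing cost. */
    static double netProfit(long customersContacted, double responseRate,
                            double costPerItem, double profitPerResponse) {
        double profitFromResponders =
                expectedResponses(customersContacted, responseRate) * profitPerResponse;
        return profitFromResponders - campaignCost(customersContacted, costPerItem);
    }

    /** Response rate at which profit from responders exactly covers the mailing cost. */
    static double breakEvenResponseRate(double costPerItem, double profitPerResponse) {
        return costPerItem / profitPerResponse;
    }

    public static void main(String[] args) {
        long customers = 1_000_000L;      // DMWHIZZ customer base
        double responseRate = 0.02;       // 2 percent historical response rate
        double costPerItem = 1.50;        // mailing cost per item, in dollars
        double profitPerResponse = 50.0;  // average profit per responder, in dollars

        System.out.println("Expected responses: "
                + expectedResponses(customers, responseRate));
        System.out.println("Campaign cost:      "
                + campaignCost(customers, costPerItem));
        System.out.println("Net profit:         "
                + netProfit(customers, responseRate, costPerItem, profitPerResponse));
        System.out.println("Break-even rate:    "
                + breakEvenResponseRate(costPerItem, profitPerResponse));
    }
}
```

With these inputs the break-even response rate is 3 percent, so mailing the entire customer base at a 2 percent response rate cannot pay for itself. Contacting only a subset of customers whose response rate exceeds that threshold is what would make the campaign profitable.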

Let's see how applying data mining can make this campaign

profitable. Selecting those customers most likely to respond is a

classification problem (i.e., classify each customer as responding or

not, with an associated probability). As with any classification

problem, DMWHIZZ will need actual response data from a

similar campaign to learn customer behavior. To achieve this,
