Using the Data Mining Add-In for Microsoft Office - Microsoft Business Intelligence Tools for Excel Analysts


The values are unique to each type of scenario being analyzed, but generally there are two cost types and two profit types:

➤ False positive (FP): A prediction that targeting this customer would lead to a bicycle purchase was false (a cost was incurred by targeting a customer, and no sale resulted).

➤ False negative (FN): A prediction that targeting this customer would not lead to a bicycle purchase was false (an opportunity cost was incurred because the customer would have purchased a bicycle had he been targeted).

➤ True positive (TP): A prediction that targeting this customer would lead to a bicycle purchase was true (a sale was made with a realized gain).

➤ True negative (TN): A prediction that targeting this customer would not lead to a bicycle purchase was true (the cost of targeting was avoided).
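As a rough illustration of how the four outcome types combine into a single profit figure, the following Python sketch adds up profits and subtracts costs. The function name, the outcome counts, and the dollar values are hypothetical and chosen only for the example. They are not taken from the add-in.

```python
def total_profit(tp, fp, tn, fn, tp_profit, fp_cost, tn_profit, fn_cost):
    """Combine outcome counts with per-outcome profits and costs.

    Profits (TP, TN) are added; costs (FP, FN) are subtracted.
    """
    return tp * tp_profit + tn * tn_profit - fp * fp_cost - fn * fn_cost


# Hypothetical outcome counts for 200 targeted/untargeted customers.
counts = {"tp": 40, "fp": 25, "tn": 120, "fn": 15}

# Hypothetical per-outcome values in dollars.
profit = total_profit(**counts, tp_profit=10, fp_cost=10, tn_profit=0, fn_cost=0)
print(profit)
```

With these made-up numbers, the 40 true positives contribute $400 and the 25 false positives cost $250, so raising the false negative cost above $0 would lower the total by that amount for each of the 15 missed buyers.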

You can update the input values for the Prediction Calculator, and the maximum profit chart and suggested threshold update instantly. In Figure 14-15, the profits and costs in the Prediction Calculator are set as follows: False Positive Cost = $10, False Negative Cost = $0, True Positive Profit = $10, True Negative Profit = $0. These inputs give you a Suggested Threshold to Maximize Profit of 512.
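One way to picture how a profit-maximizing threshold can be found is to try each candidate score as the cutoff and keep the one with the highest total profit. The Python sketch below does this under stated assumptions: a customer is targeted when the score is greater than or equal to the threshold, and the scores and purchase outcomes are hypothetical test cases. It is not the add-in's actual calculation, and the function names are invented for the example.

```python
def profit_at_threshold(scores, purchased, threshold,
                        fp_cost=10, fn_cost=0, tp_profit=10, tn_profit=0):
    """Total profit if every customer scoring >= threshold is targeted."""
    total = 0
    for score, bought in zip(scores, purchased):
        targeted = score >= threshold  # assumption: higher score means "target"
        if targeted and bought:
            total += tp_profit   # true positive
        elif targeted and not bought:
            total -= fp_cost     # false positive
        elif not targeted and bought:
            total -= fn_cost     # false negative
        else:
            total += tn_profit   # true negative
    return total


def suggest_threshold(scores, purchased, **values):
    """Return the candidate score that gives the highest total profit."""
    candidates = sorted(set(scores))
    return max(candidates,
               key=lambda t: profit_at_threshold(scores, purchased, t, **values))


# Hypothetical test cases: model scores and whether each customer bought.
scores = [900, 750, 600, 520, 400, 300]
purchased = [True, True, True, False, False, False]

best = suggest_threshold(scores, purchased)
print(best, profit_at_threshold(scores, purchased, best))
```

Changing any of the four cost and profit values shifts the totals at each cutoff, which is why the suggested threshold moves when the inputs are edited.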

Note

Although there may be significant variability in the input data that is used to build the model, such as some products that produce very high profits and some that produce very low profits, the profits and costs are global factors. This means that a true positive (that is, a sale is predicted and you did make the sale) carries the same value regardless of product cost. To get around this, you can limit the input data to similar cases so that there is more consistency in terms of matching the global profits and costs to the individual cases.

Score Breakdown

Directly below the cost and profit inputs, the Score Breakdown section uses a point system and shaded bars to show the relative impact of each input column (and input column state) in terms of its tendency to lead to the target predicted column state, which in this case is Purchase Bike = Yes. By sorting the Relative Impact column by Largest to Smallest, you can see the strongest predictors of purchasing a bike: Children = 3, Cars = 0, and so on. Note that these predictive power scores are based on the underlying regression model and do not change when you change the cost and profit inputs.

Data table

Several dozen rows below the Score Breakdown is a data table that is used as an input for the remaining charts to be discussed in the next two sections of this chapter. This data table is automatically built for purposes of simulating a reasonable number of test cases. Each test case (each row in the table) has a number of prediction outcome states (remember that there are four prediction outcome
