Database Reference
In-Depth Information
CustomerID) will not be unique and thus cannot be the primary key. Note that SALES_
FOR_RFM links to the same CUSTOMER and TIMELINE dimension tables as PRODUCT_
SALES. This is done to maintain consistency within the data warehouse, and when a
dimension table links to two or more fact tables, it is called a conformed dimension .
Why would we add a fact table named SALES_FOR_RFM? To explain that, we need to
discuss reporting systems.
Reporting Systems
The purpose of a reporting system is to create meaningful information from disparate data
sources and to deliver that information to the proper users on a timely basis. As stated earlier,
reporting systems differ from data mining because they create information using the simple
operations of sorting, filtering, grouping, and making simple calculations. We begin this sec-
tion with a description of a typical reporting problem: RFM analysis.
RFM Analysis
RFM analysis is a way of analyzing and ranking customers according to their purchasing pat-
terns. It is a simple technique that considers how recently ( R score ) a customer ordered, how
frequently ( F score ) a customer orders, and how much money ( M score ) the customer spends
per order. RFM is summarized in Figure 12-19.
To produce an RFM score, we need only two things: customer data and sales data for each
purchase (the date of the sale and the total amount of the sale) made by each customer. If you
look at the SALES_FOR_RFM table and its associated CUSTOMER and TIMELINE dimension
tables in Figure 12-18, you see that we have exactly those data: The SALES_FOR_RFM table is
the starting point for RFM analysis in the HSD-DW BI system.
To produce an RFM score, customer purchase records are first sorted by the date of their
most recent (R) purchase. In a common form of this analysis, the customers are divided into
five groups, and a score of 1 to 5 is given to customers in each group. Thus, the 20 percent of
the customers having the most recent orders are given an R score of 1, the 20 percent of the
customers having the next most recent orders are given an R score of 2, and so forth, down to
the last 20 percent, who are given an R score of 5.
The customers are then resorted on the basis of how frequently they order. The 20 percent
of the customers who order most frequently are given an F score of 1, the next 20 percent most
frequently ordering customers are given a score of 2, and so forth, down to the least frequently
ordering customers, who are given an F score of 5.
Finally, the customers are sorted again according to the amount of their orders. The
20  percent who have ordered the most expensive items are given an M score of 1, the next
20 percent are given an M score of 2, and so forth, down to the 20 percent who spend the least,
who are given an M score of 5.
Figure 12-20 shows sample RFM data for Heather Sweeney Designs. (Note that these
data have not been calculated and are for illustrative purposes only.) The first customer, Ralph
Able, has a score of {1 1 2}, which means that he has ordered recently and orders frequently.
His M score of 2 indicates, however, that he does not order the most expensive goods. From
these scores, the salespeople can surmise that Ralph is a good customer but that they should
attempt to up-sell Ralph to more expensive goods.
Figure 12-19
RFM Analysis
• Simple report-based customer classification scheme
• Score customers on recentness, frequency, and
monetary size of orders
• Typically, divide each criterion into 5 groups and
score from 1 to 5
 
 
Search WWH ::




Custom Search