The Mining Data Mart - Data Mining Techniques in CRM: Inside Customer Segmentation

Database Reference

In-Depth Information

In SQL this is expressed as: update

<

table

>

set

<

field

> = <

sub query or

>

• Restructure: Pivots data and generates multiple fields based on the values

of another field. Restructure is usually followed by aggregation. For instance,

restructure can be applied when we have groups of related records (e.g., multiple

records denoting the transaction amount and the transaction channel for each

customer) that we want to rearrange so that the original multiple records are

pivoted into a single record with multiple separate fields: that is, one record for

each customer with multiple fields, denoting the relative sum for each distinct

transaction channel. In SQL this is expressed as: case when

formula

<

formula

>

then

end

• Derive/compute: Modifies data values and constructs new fields from existing

ones, based on calculations, logical conditions, and functions. During lengthy

data mining projects, it is common to perform several derivations, such as flags,

ratios, percentages, etc.

<

value1

>

else

<

value2

>

As a simplified example of a typical data processing procedure required for

the construction of the MCIF, let us consider the transaction data in Table 4.12

for two bank customers. The data denote the volume of transactions (Amount_

Transactions) per channel, in a format similar to the data mart's ''M_Transactions''

table presented earlier. The objective is to turn these data into informative

indicators of the channel utilization for each customer.

This information is detailed and informative, yet not immediately suitable for

mining purposes. It has to go through extensive data preparation before revealing

the transactional profile of each customer. The data preparation steps necessary in

order to summarize this information at a customer level for the needs of the MCIF

include:

1. Restructure and aggregation of the transactional data at a customer

level: The multiple records per customer (one record per year/month, account

ID, transaction code, and transaction channel) are pivoted and aggregated so

that each customer is represented by a single record and a set of relevant fields

which indicate the total volume for each transaction channel. We assume here

that no differentiation in terms of product code or time period is necessary for

the needs of our analysis. The intermediate table which is produced through

restructure and aggregation is given in Table 4.13.

The IBM SPSS Modeler Restructure node used for pivoting the data is

displayed in Figure 4.2.

2. Derive new fields to enrich the original information: After the data have

been pivoted and aggregated at a customer level, the next step is to enrich the

Data Mining Techniques in CRM: Inside Customer Segmentation

Search WWH ::

Custom Search

Home