likely, or unlikely, to attrite. Using the data mining classification
function, ABCBank can predict customers who are likely to attrite
and understand the characteristics, or profiles, of such customers.
Gaining a better understanding of customer behavior enables
ABCBank to develop business plans to retain customers.
Classification is used to assign cases, such as customers, to discrete
values, called classes or categories, of the target attribute. The
target is the attribute whose values are predicted using data mining.
In this problem, the target is the attribute attrite with two possible
values: Attriter and Non-attriter. When referring to the model build
dataset, the value Attriter indicates that the customer closed all
accounts, and Non-attriter indicates the customer has at least one
account at ABCBank. When referring to the prediction in the model
apply dataset, the value Attriter indicates that the customer is likely
to attrite and Non-attriter indicates that the customer is not likely to
attrite. The prediction is often associated with a probability
indicating how likely the customer is to attrite. When a target
attribute has only two possible values, the problem is referred to as a
binary classification problem. When a target attribute has more than
two possible values, the problem is known as a multiclass
classification problem.
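The distinction between binary and multiclass problems, and the role of the prediction probability, can be sketched in a few lines of plain Java. This is only an illustration: the class name AttritionSketch, its methods, and the 0.5 cutoff are assumptions made for this example and are not part of the JDM API. How a real model converts a probability into a predicted class may differ.

```java
import java.util.Set;

// Hypothetical illustration only; none of these names come from the JDM API.
final class AttritionSketch {

    // A target with exactly two distinct values is a binary problem;
    // more than two values makes it a multiclass problem.
    static String problemType(Set<String> targetValues) {
        if (targetValues.size() < 2) {
            throw new IllegalArgumentException("a target needs at least two values");
        }
        return targetValues.size() == 2 ? "binary" : "multiclass";
    }

    // Maps the probability that a customer attrites to a predicted class.
    // The threshold is an assumed cutoff chosen by the caller.
    static String predict(double attriteProbability, double threshold) {
        return attriteProbability >= threshold ? "Attriter" : "Non-attriter";
    }

    public static void main(String[] args) {
        Set<String> attriteValues = Set.of("Attriter", "Non-attriter");
        System.out.println("Problem type: " + problemType(attriteValues));
        System.out.println("p = 0.82 -> " + predict(0.82, 0.5));
        System.out.println("p = 0.10 -> " + predict(0.10, 0.5));
    }
}
```

A lower threshold would flag more customers as likely attriters, which may suit a retention campaign where missing an attriter costs more than contacting a loyal customer.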
7.1.3 Data Specification: CUSTOMERS Dataset
As noted in Chapter 3, an important step in any data mining project
is to collect related data from enterprise data sources. Identifying
which attributes should be used for data mining is one of the
challenges faced by the data miner and relies on appropriate domain
knowledge of the data. In this example, we introduce a subset of
possible customer attributes as listed in Table 7-1. In real-world
scenarios, there may be hundreds or even thousands of customer
attributes available in enterprise databases.
Table 7-1 lists physical attribute details of the CUSTOMERS dataset,
which include name, data type, and description. The attribute name
refers to either a column name of a database table or a field name of a
flat file. The attribute data type refers to the allowed type of values
for that attribute. JDM defines integer, double, and string data types,
which are commonly used data types for mining. JDM conformance
rules allow a vendor to add more data types if required. Attribute
description can be used to explain the meaning of the attribute or
describe the allowed values. In general, physical data characteristics
are captured by database metadata.
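To make the three physical attribute details concrete, the following is a minimal sketch of how a name, a data type, and a description might be held together in plain Java. The class PhysicalAttributeSketch, its nested DataType enum, and the accepts method are hypothetical stand-ins written for this example; they are not the JDM interfaces. The attribute attrite is taken from the text above, and its description string is an assumption.

```java
// Hypothetical stand-in for illustration; this is not a JDM API type.
final class PhysicalAttributeSketch {

    // The three data types the text says JDM defines.
    enum DataType { INTEGER, DOUBLE, STRING }

    final String name;         // column name of a table or field name of a flat file
    final DataType dataType;   // allowed type of values for the attribute
    final String description;  // meaning of the attribute or its allowed values

    PhysicalAttributeSketch(String name, DataType dataType, String description) {
        this.name = name;
        this.dataType = dataType;
        this.description = description;
    }

    // Returns true when the value's Java type matches the declared data type.
    boolean accepts(Object value) {
        switch (dataType) {
            case INTEGER: return value instanceof Integer;
            case DOUBLE:  return value instanceof Double;
            case STRING:  return value instanceof String;
            default:      return false;
        }
    }

    public static void main(String[] args) {
        PhysicalAttributeSketch attrite = new PhysicalAttributeSketch(
                "attrite", DataType.STRING, "Attriter or Non-attriter");
        System.out.println(attrite.name + " accepts \"Attriter\": " + attrite.accepts("Attriter"));
        System.out.println(attrite.name + " accepts 42: " + attrite.accepts(42));
    }
}
```

In practice this information would usually be read from database metadata rather than written by hand, as the paragraph above notes.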