Database Reference
In-Depth Information
Figure 7.2
Example of a decision stump
To illustrate how a decision tree works, consider the case of a bank that wants
to market its term deposit products (such as Certificates of Deposit) to the
appropriate customers. Given the demographics of clients and their reactions to
previous campaign phone calls, the bank's goal is to predict which clients would
subscribe to a term deposit. The dataset used here is based on the original dataset
collected from a Portuguese bank on directed marketing campaigns as stated in
the work by Moro et al. [6].
Figure 7.3
shows a subset of the modified bank
marketing dataset. This dataset includes 2,000 instances randomly drawn from
the original dataset, and each instance corresponds to a customer. To make the
example simple, the subset only keeps the following categorical variables: (1)
job
,
(2)
marital
status, (3)
education
level, (4) if the credit is in
default
, (5) if
there is a
housing
loan, (6) if the customer currently has a personal
loan
, (7)
contact
type, (8) result of the previous marketing campaign contact (
poutcome
),
and finally (9) if the client actually
subscribed
to the term deposit. Attributes (1)
through (8) are input variables, and (9) is considered the outcome. The outcome
subscribed
is either
yes
(meaning the customer will subscribe to the term
deposit) or
no
(meaning the customer won't subscribe). All the variables listed
earlier are categorical.