customers' activity on the company's web site, he can anticipate approximately when each person
will be most likely to buy an eReader. He believes data mining can help him figure out which
activities are the best predictors of which category a customer will fall into. Knowing this, he can
time his marketing to each customer to coincide with their likelihood of buying.
DATA UNDERSTANDING
Richard has engaged us to help him with his project. We have decided to use a decision tree
model in order to find good early predictors of buying behavior. Because Richard's company does
all of its business through its web site, there is a rich data set of information for each customer,
including items they have just browsed for, and those they have actually purchased. He has
prepared two data sets for us to use. The training data set contains the web site activities of
customers who bought the company's previous generation reader, and the timing with which they
bought their reader. The second, the scoring data set, comprises attributes of current customers who
Richard hopes will buy the new eReader. He wants to determine which category of adopter each person in
the scoring data set will fall into based on the profiles and buying timing of those people in the
training data set.
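To make the idea of a "good predictor" concrete, the following is a minimal sketch of how a decision tree might rank attributes, assuming information gain as the splitting criterion (the text does not say which criterion is used). The rows, the Browsed_Books attribute, and the Adoption label are hypothetical and are not taken from Richard's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, label):
    """Entropy reduction obtained by splitting rows on one categorical attribute."""
    base = entropy([r[label] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[label])
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy training rows: each has attributes plus a known label.
training = [
    {"Gender": "M", "Browsed_Books": "Yes", "Adoption": "Early"},
    {"Gender": "F", "Browsed_Books": "Yes", "Adoption": "Early"},
    {"Gender": "M", "Browsed_Books": "No", "Adoption": "Late"},
    {"Gender": "F", "Browsed_Books": "No", "Adoption": "Late"},
]

# Score each candidate attribute; the highest gain is the strongest predictor here.
gains = {a: information_gain(training, a, "Adoption") for a in ("Gender", "Browsed_Books")}
best = max(gains, key=gains.get)
print(gains, best)
```

In this toy data the hypothetical Browsed_Books attribute separates the two labels perfectly while Gender does not separate them at all, so a tree built this way would split on Browsed_Books first. A scoring data set would have the same attributes but no Adoption label; the fitted tree supplies the predicted category.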
In analyzing his data set, Richard has found that customers' activity in the areas of digital media
and books, and their general activity with electronics for sale on his company's site, appear to be
related to when a person buys an eReader. With this in mind, we have worked with
Richard to compile data sets composed of the following attributes:
User_ID : A numeric, unique identifier assigned to each person who has an account on
the company's web site.
Gender : The customer's gender, as identified in their customer account. In this data set, it
is recorded as 'M' for male and 'F' for female. The Decision Tree operator can handle
non-numeric data types.
Age : The person's age at the time the data were extracted from the web site's database.
This is calculated to the nearest year by taking the difference between the system date and
the person's birthdate as recorded in their account.
Marital_Status : The person's marital status as recorded in their account. People who
indicated on their account that they are married are entered in the data set as 'M'. Since the
 