ORGANIZATIONAL UNDERSTANDING
Sarah's new data mining objective is pretty clear: she wants to anticipate demand for a consumable
product. We will use a linear regression model to help her make these predictions. Her data consist of
1,218 observations from the Chapter 4 data set, each giving an attribute profile for a home
along with that home's annual heating oil consumption. She wants to use this data set as training
data to predict the usage that 42,650 new clients will bring to her company. She knows that these
new clients' homes are similar in nature to her existing client base, so the existing customers' usage
behavior should serve as a solid gauge for predicting future usage by new customers.
DATA UNDERSTANDING
As a review, our data set from Chapter 4 contains the following attributes:
Insulation : This is a density rating, ranging from one to ten, indicating the thickness of
each home's insulation. A home with a density rating of one is poorly insulated, while a
home with a density of ten has excellent insulation.
Temperature : This is the average outdoor ambient temperature at each home for the
most recent year, measured in degrees Fahrenheit.
Heating_Oil : This is the total number of units of heating oil purchased by the owner of
each home in the most recent year.
Num_Occupants : This is the total number of occupants living in each home.
Avg_Age : This is the average age of those occupants.
Home_Size : This is a rating, on a scale of one to eight, of the home's overall size. The
higher the number, the larger the home.
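As a quick illustration of the attribute definitions above, the following Python sketch checks that the two rated attributes fall within their stated scales (one to ten for Insulation, one to eight for Home_Size). The sample rows and the helper name out_of_scale are assumptions made up for this example; they are not taken from Sarah's data.

```python
import csv
import io

# Scales as described in the attribute list; the other attributes have no stated bounds.
RATING_SCALES = {"Insulation": (1, 10), "Home_Size": (1, 8)}

# Hypothetical sample rows, invented purely for illustration (not Sarah's data).
SAMPLE_CSV = """Insulation,Temperature,Heating_Oil,Num_Occupants,Avg_Age,Home_Size
4,74,132,4,23.8,4
10,43,263,4,56.7,4
11,50,200,3,40.0,9
"""


def out_of_scale(rows):
    """Return (row_number, attribute, value) for each rating outside its scale."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for attribute, (low, high) in RATING_SCALES.items():
            value = float(row[attribute])
            if not low <= value <= high:
                problems.append((row_number, attribute, value))
    return problems


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
problems = out_of_scale(rows)
print(problems)
```

Only the third sample row is flagged here, because it was deliberately written with ratings outside both scales.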
We will use the Chapter 4 data set as our training data set in this chapter. Sarah has assembled a
separate Comma Separated Values file containing all of these same attributes, except of course for
Heating_Oil, for her 42,650 new clients. She has provided this data set to us to use as the scoring
data set in our model.
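The relationship between the training and scoring data sets can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool or procedure used in this chapter: the rows are invented, the Heating_Oil values are generated from a made-up linear formula so the example is easy to check, and the helper names (solve, fit, predict, invented_usage) are hypothetical. The model is fitted by ordinary least squares on rows that include Heating_Oil, then applied to a row that lacks it.

```python
PREDICTORS = ["Insulation", "Temperature", "Num_Occupants", "Avg_Age", "Home_Size"]
TARGET = "Heating_Oil"


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(training_rows):
    """Return least-squares coefficients [intercept, b1, ..., b5] via the normal equations."""
    X = [[1.0] + [float(r[p]) for p in PREDICTORS] for r in training_rows]
    y = [float(r[TARGET]) for r in training_rows]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Apply the fitted coefficients to one row of predictor values."""
    return coefs[0] + sum(c * float(row[p]) for c, p in zip(coefs[1:], PREDICTORS))


def invented_usage(ins, temp, occ, age, size):
    """Made-up linear formula used only to generate the illustrative training target."""
    return 130 - 4 * ins - 0.8 * temp + 1 * occ + 2 * age + 3 * size


# Invented predictor values (Insulation, Temperature, Num_Occupants, Avg_Age, Home_Size).
raw = [
    (4, 74, 4, 23.8, 4), (10, 43, 4, 56.7, 4), (7, 81, 2, 28.0, 6), (6, 50, 1, 45.1, 3),
    (2, 80, 5, 20.8, 2), (9, 38, 6, 66.2, 7), (3, 62, 3, 34.5, 5), (8, 55, 2, 51.0, 8),
]
training = [dict(zip(PREDICTORS, r), **{TARGET: invented_usage(*r)}) for r in raw]

# A scoring row carries the same attributes except Heating_Oil.
scoring = [dict(zip(PREDICTORS, (5, 60, 3, 40.0, 4)))]

coefs = fit(training)
predictions = [predict(coefs, row) for row in scoring]
print(predictions)
```

Because the illustrative target is an exact linear function of the predictors, the fitted model reproduces that formula; real consumption data would leave residual error.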
 
 