CHAPTER FOUR:
CORRELATION
CONTEXT AND PERSPECTIVE
Sarah is a regional sales manager for a nationwide supplier of fossil fuels for home heating. Recent
volatility in market prices for heating oil specifically, coupled with wide variability in the size of
each order for home heating oil, has Sarah concerned. She feels a need to understand the types of
behaviors and other factors that may influence the demand for heating oil in the domestic market.
What factors are related to heating oil usage, and how might she use knowledge of such factors
to better manage her inventory and anticipate demand? Sarah believes that data mining can help
her begin to formulate an understanding of these factors and interactions.
LEARNING OBJECTIVES
After completing the reading and exercises in this chapter, you should be able to:
Explain what correlation is, and what it isn't.
Recognize the necessary format for data in order to perform correlation analysis.
Develop a correlation model in RapidMiner.
Interpret the coefficients in a correlation matrix and explain their significance, if any.
ORGANIZATIONAL UNDERSTANDING
Sarah's goal is to better understand how her company can succeed in the home heating oil market.
She recognizes that there are many factors that influence heating oil consumption, and believes
that by investigating the relationship between a number of those factors, she will be able to better
monitor and respond to heating oil demand. She has selected correlation as a way to model the
relationship between the factors she wishes to investigate. Correlation is a statistical measure of
how strongly pairs of attributes in a data set are related to one another.
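To make the idea of measuring the strength of a relationship concrete, the following is a minimal sketch in Python of one common correlation measure, the Pearson coefficient, computed for every pair of attributes. It is an illustration only and is not the RapidMiner model developed in this chapter. The attribute names and values are hypothetical and do not come from Sarah's data.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(data):
    """Return a nested dict of pairwise coefficients for a dict of columns."""
    names = list(data)
    return {a: {b: pearson(data[a], data[b]) for b in names} for a in names}


# Hypothetical observations, one value per home (illustrative only).
homes = {
    "insulation": [2, 4, 6, 8, 10],
    "avg_age": [23, 35, 41, 58, 64],
    "heating_oil": [250, 210, 180, 120, 90],
}

matrix = correlation_matrix(homes)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

Each coefficient falls between -1 and 1. Values near either extreme indicate a strong relationship, values near 0 indicate a weak one, and the sign shows whether the two attributes move in the same direction or in opposite directions.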