Sarah would also be wise to remember that the CRISP-DM approach is cyclical in nature. Each
month as new orders come in and new bills go out, as new customers sign up for a heating oil
account, there are additional data available to add into the model. As she learns more about how
each attribute in her data set interacts with others, she can improve her correlation model by
adding not only new attributes but also new observations.
CHAPTER SUMMARY
This chapter has introduced the concept of correlation as a data mining model. It has been chosen
as the first model for this topic because it is relatively simple to construct, run and interpret, thus
serving as an easy starting point upon which to build. Future models will become more complex,
but continuing to develop your skills in RapidMiner and getting comfortable with the tools will
make the more complex models easier for you to achieve as we move forward.
Recall from Chapter 1 (Figure 1-2) that data mining has two somewhat interconnected sides:
Classification and Prediction. Correlation has been shown to be primarily on the side of
Classification. We do not infer causation using correlation metrics, nor do we use correlation
coefficients to predict one attribute's value based on another's. We can, however, quickly find
general trends in data sets using correlations, and we can anticipate how strongly an observed
movement in one attribute will occur in conjunction with movement in another.
Correlation can be a quick and easy way to see how elements of a given problem may be
interacting with one another. Whenever you find yourself asking how certain factors in a problem
you're trying to solve interact with one another, consider building a correlation matrix to find out.
For example, does customer satisfaction change based on time of year? Does the amount of
rainfall change the price of a crop? Does household income influence which restaurants a person
patronizes? The answer to each of these questions is probably 'yes', but correlation not only
helps us know whether that's true, it also helps us learn how strong the interactions are when,
and if, they occur.
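
Outside of RapidMiner, the same idea can be sketched in a few lines of Python. The example below is only an illustration: the attribute names and values are invented for this sketch, not taken from the chapter's data set, and the coefficient is assumed to be the standard Pearson correlation. It builds a small correlation matrix for the rainfall and crop price question posed above.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from each mean.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)
    if sq_dev_x == 0 or sq_dev_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return co_dev / sqrt(sq_dev_x * sq_dev_y)


def correlation_matrix(data):
    """Return a nested dict holding the coefficient for every pair of attributes."""
    names = list(data)
    return {a: {b: pearson(data[a], data[b]) for b in names} for a in names}


# Hypothetical observations, invented purely for illustration.
data = {
    "rainfall_cm": [10, 20, 30, 40, 50],
    "crop_price": [9.0, 7.5, 6.5, 4.0, 3.0],
    "satisfaction": [3, 5, 2, 4, 3],
}

matrix = correlation_matrix(data)

# Print the matrix as a simple aligned table.
names = list(data)
print(" " * 14 + "".join(f"{name:>14}" for name in names))
for a in names:
    print(f"{a:>14}" + "".join(f"{matrix[a][b]:>14.3f}" for b in names))
```

Each coefficient falls between -1 and 1. In this made-up sample, rainfall and crop price move strongly in opposite directions, so their coefficient is close to -1, while the diagonal is always 1 because every attribute correlates perfectly with itself. As in the chapter, the matrix describes how strongly attributes move together; it does not show that one causes the other.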