CHAPTER SIX:
K-MEANS CLUSTERING
CONTEXT AND PERSPECTIVE
Sonia is a program director for a major health insurance provider. Recently she has been reading
medical journals and other articles, and has found a strong emphasis on the influence of weight,
gender and cholesterol on the development of coronary heart disease. The research she has read
confirms time after time that these three variables are connected to heart disease, and while there is
little that can be done about one's gender, there are certainly life choices that can be made to alter
one's cholesterol and weight. She begins brainstorming ideas for her company to offer weight and
cholesterol management programs to individuals who receive health insurance through her
employer. As she considers where her efforts might be most effective, she finds herself wondering
if there are natural groups of individuals who are most at risk for high weight and high cholesterol,
and if there are such groups, where the natural dividing lines between the groups occur.
LEARNING OBJECTIVES
After completing the reading and exercises in this chapter, you should be able to:
• Explain what k-means clusters are, how they are found and the benefits of using them.
• Recognize the necessary format for data in order to create k-means clusters.
• Develop a k-means cluster data mining model in RapidMiner.
• Interpret the clusters generated by a k-means model and explain their significance, if any.
ORGANIZATIONAL UNDERSTANDING
Sonia's goal is to identify and then try to reach out to individuals insured by her employer who are
at high risk for coronary heart disease because of high weight and/or high cholesterol. She
understands that those at low risk, that is, those with low weight and cholesterol, are unlikely to