Figure 3.20 Using a scatterplot to examine the structure of the clusters.
Although all the above evaluation measures make mathematical sense and
provide technical help to identify the optimal solution, data miners should not
solely base their decision on them. A clustering solution is justified only if it
makes sense from a business point of view. Actionability, potential business value,
interpretability, and ease of use are factors that are hard to quantify and are, in a
way, measured subjectively. However, they are the best benchmarks for determining the
optimal clustering solution. The profiling of the clusters and identification of their
defining characteristics are essential parts of the clustering procedure. They should
not be considered post-analysis tasks but rather essential steps for assessing
the effectiveness of the solution; that is why we have dedicated a whole section of
this chapter to these topics.
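
As a rough illustration of what profiling can look like in practice, the following Python sketch compares each cluster's mean on every input field with the overall mean. It is only a minimal example under stated assumptions: the function name profile_clusters, the field names spend and visits, and the sample records are all hypothetical and do not come from the text.

```python
from collections import defaultdict
from statistics import mean


def profile_clusters(records, labels):
    """For each cluster, return every field's mean and its ratio to the overall mean.

    records: list of dicts holding numeric fields (hypothetical structure).
    labels:  cluster label of each record, in the same order.
    """
    if len(records) != len(labels):
        raise ValueError("records and labels must have the same length")
    fields = list(records[0].keys())
    overall = {f: mean(r[f] for r in records) for f in fields}

    # Group the records by their assigned cluster.
    members = defaultdict(list)
    for record, label in zip(records, labels):
        members[label].append(record)

    profile = {}
    for label, group in members.items():
        profile[label] = {}
        for f in fields:
            cluster_mean = mean(r[f] for r in group)
            # An index above 1 means the cluster scores above the overall average.
            index = cluster_mean / overall[f] if overall[f] else None
            profile[label][f] = {"mean": cluster_mean, "index": index}
    return profile


# Hypothetical customers and cluster assignments, for illustration only.
customers = [
    {"spend": 10, "visits": 1},
    {"spend": 20, "visits": 3},
    {"spend": 90, "visits": 9},
    {"spend": 80, "visits": 7},
]
assignments = [0, 0, 1, 1]

for cluster, fields in profile_clusters(customers, assignments).items():
    print(cluster, fields)
```

Fields whose index lies far from 1 are candidates for a cluster's defining characteristics. Whether those differences are actionable remains a business judgment that the numbers alone cannot settle.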
As with every data mining model, data miners should try several different
clustering techniques and compare the similarity of the derived solutions before
deciding which one to choose. When different techniques generate analogous
results, it is a good sign that the solution is general and valid. Clustering
results should also be validated by applying the model to a disjoint dataset
and examining the consistency of the results.
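
One possible way to quantify how similar two clustering solutions are is the adjusted Rand index, which measures pairwise agreement between two label assignments and corrects for chance. The text does not prescribe this measure; it is shown here as an assumption, and the function name and sample labels below are hypothetical. A value of 1 means the two solutions group the records identically (cluster numbering is irrelevant), and values near 0 mean agreement no better than chance.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two cluster assignments of the same records."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label lists must cover the same records")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Count pairs of records placed together in both solutions, and in each one alone.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both solutions have a single cluster
    return (together_both - expected) / (maximum - expected)


# Hypothetical labels produced by two different techniques on the same six records.
technique_1 = [0, 0, 1, 1, 2, 2]
technique_2 = [1, 1, 2, 2, 0, 0]  # same grouping, different cluster numbering
print(adjusted_rand_index(technique_1, technique_2))
```

The same function can support validation on a disjoint dataset. For example, one could fit the model again on the holdout records, compare those labels with the labels the original model assigns to the same records, and treat a low index as a sign that the solution is not consistent.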