Cluster Analysis - Visual Data Mining: The VisMiner Approach

7 Cluster Analysis

Introduction

Cluster analysis is the process of grouping observations based on similarity

(visually observed as proximity), connectedness, or density. The result of a

cluster analysis is called a clustering.

Cluster analysis is similar in concept to the previously discussed process of

classification. In classification, the observation groupings (classifications) are

known a priori. The objective of classification analysis is to discover relationships

between other dataset attributes and the previously known class attribute

that could be used to predict class membership. However, in cluster analysis the

groupings are not previously known. The objective is the discovery of clusters

of observations grouped according to dataset attribute values.
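
To make the contrast with classification concrete, the following is a minimal sketch of grouping unlabeled observations by proximity. It uses k-means, one common proximity-based method; the text above does not name a specific algorithm, so that choice, the function name k_means, the seeding rule, and the toy observations are all assumptions made for illustration.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Group points into k clusters by Euclidean proximity (illustrative sketch)."""
    # Assumption: seed the centers with the first k points so the run is deterministic.
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each observation joins its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers


# Hypothetical observations with two attributes each; no class attribute is supplied.
observations = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.3), (5.1, 4.8), (1.2, 0.8), (4.9, 5.0)]
labels, centers = k_means(observations, k=2)
print(labels)
```

Note that no known class attribute is passed in: the group labels are discovered from the attribute values alone, which is the distinction drawn above between cluster analysis and classification.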

In data mining, there are a number of potential objectives in conducting a

cluster analysis.

Sub-population identification and isolation. As has been discussed in previous

chapters, datasets may be composed of observations drawn from populations

with different characteristics. Relationships that hold only within a single subset are

harder to identify when exploring the full dataset than when exploring that subset

alone. Hence, a good rule of thumb is to isolate the subsets and then analyze each

individually. A strategy in product marketing is to first segment the market,

then develop specific promotions for selected market segments. The same

principle may be applied to data mining: isolate subsets, then develop custom

analysis plans for each.
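
The effect of mixing sub-populations can be sketched with a small, deliberately extreme example. The two segments below are made-up toy data, and the helper name pearson is an assumption for illustration. Within each segment the two attributes move together perfectly, yet the pooled dataset suggests the opposite relationship.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical segments: (x, y) pairs drawn from two different sub-populations.
segment_a = [(1, 11), (2, 12), (3, 13), (4, 14)]
segment_b = [(11, 1), (12, 2), (13, 3), (14, 4)]

# Correlation within each isolated subset.
r_a = pearson(*zip(*segment_a))
r_b = pearson(*zip(*segment_b))

# Correlation when the subsets are pooled into one dataset.
r_all = pearson(*zip(*(segment_a + segment_b)))

print(f"segment A: {r_a:.2f}, segment B: {r_b:.2f}, pooled: {r_all:.2f}")
```

Real datasets are rarely this clean, but the sketch shows why isolating subsets before analysis can reveal relationships that the full dataset obscures.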
