11
Advanced Cluster Analysis
You learned the fundamentals of cluster analysis in Chapter 10. In this chapter, we discuss
advanced topics of cluster analysis. Specifically, we investigate four major perspectives:
Probabilistic model-based clustering: Section 11.1 introduces a general framework
and a method for deriving clusters where each object is assigned a probability of
belonging to a cluster. Probabilistic model-based clustering is widely used in many
data mining applications such as text mining.
Clustering high-dimensional data: When the dimensionality is high, conventional
distance measures can be dominated by noise. Section 11.2 introduces fundamental
methods for cluster analysis on high-dimensional data.
Clustering graph and network data: Graph and network data are increasingly popular
in applications such as online social networks, the World Wide Web, and digital
libraries. In Section 11.3, you will study the key issues in clustering graph and
network data, including similarity measurement and clustering methods.
Clustering with constraints: In our discussion so far, we have not assumed any constraints
in clustering. In some applications, however, various constraints may exist.
These constraints may arise from background knowledge or spatial distribution of
the objects. You will learn how to conduct cluster analysis with different kinds of
constraints in Section 11.4.
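To make the remark about high-dimensional data concrete, the following minimal sketch illustrates how conventional distance measures can be dominated by noise. It is not taken from the chapter: the function name contrast_ratio, the two Gaussian clusters, their centers, and the number of noise dimensions are all assumptions chosen for illustration. Two well-separated clusters are generated in two informative dimensions, and a growing number of pure-noise dimensions is appended to every object. The ratio of the mean between-cluster distance to the mean within-cluster distance then indicates how much contrast Euclidean distance still provides.

```python
import math
import random


def contrast_ratio(n_noise_dims, n_points=50, seed=0):
    """Mean between-cluster distance divided by mean within-cluster distance.

    Hypothetical setup for illustration: two Gaussian clusters that differ
    only in their first two dimensions, plus n_noise_dims noise dimensions.
    """
    rng = random.Random(seed)

    def make_point(center):
        # Two informative dimensions around the cluster center,
        # followed by n_noise_dims dimensions of pure Gaussian noise.
        informative = [rng.gauss(c, 1.0) for c in center]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise_dims)]
        return informative + noise

    cluster_a = [make_point((0.0, 0.0)) for _ in range(n_points)]
    cluster_b = [make_point((6.0, 6.0)) for _ in range(n_points)]

    def mean_dist(xs, ys, skip_same_index=False):
        # Average Euclidean distance over all pairs (x, y); when comparing a
        # cluster with itself, skip the zero distance of a point to itself.
        dists = [
            math.dist(x, y)
            for i, x in enumerate(xs)
            for j, y in enumerate(ys)
            if not (skip_same_index and i == j)
        ]
        return sum(dists) / len(dists)

    within = mean_dist(cluster_a, cluster_a, skip_same_index=True)
    between = mean_dist(cluster_a, cluster_b)
    return between / within


if __name__ == "__main__":
    for d in (0, 10, 100, 1000):
        print(d, round(contrast_ratio(d), 2))
```

In this synthetic setting, the ratio should be well above 1 when no noise dimensions are present and should shrink toward 1 as noise dimensions are added, meaning that objects in different clusters look almost as close as objects in the same cluster. This is only one simple way to observe the effect; the methods in Section 11.2 address it directly.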
By the end of this chapter, you will have a good grasp of the issues and techniques
regarding advanced cluster analysis.
11.1 Probabilistic Model-Based Clustering
In all the cluster analysis methods we have discussed so far, each data object can be
assigned to only one of a number of clusters. This cluster assignment rule is required
in some applications such as assigning customers to marketing managers. However,
 