often defined by means of a distance norm that is measured among the data vectors
themselves, or as a distance from a data vector to some prototypical object or
center of the cluster. The cluster centers are usually not known beforehand and are,
therefore, determined simultaneously by the clustering algorithm while partitioning
the data. The prototypes may be vectors of the same dimension as the data objects,
but they can also be defined as geometrical objects, such as linear or nonlinear
subspaces or functions. Data can reveal clusters of different geometrical shapes,
sizes, and densities: clusters may be spherical or ellipsoidal, or may form linear or
nonlinear subspaces of the data space.
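As an illustration of similarity measured as a distance to a prototype, the following minimal sketch assigns each data vector to its nearest cluster center. The Euclidean norm, the sample data, the fixed centers, and the function names are all assumptions made for the example; in an actual clustering algorithm the centers would be determined together with the partition.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_prototype(z, prototypes):
    """Return the index of the prototype closest to data vector z."""
    distances = [euclidean(z, v) for v in prototypes]
    return distances.index(min(distances))

# Hypothetical 2-D data vectors and two assumed (already known) cluster centers.
data = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (4.9, 5.0)]
prototypes = [(0.0, 0.0), (5.0, 5.0)]

# Each data vector is labeled with the index of its nearest center.
labels = [nearest_prototype(z, prototypes) for z in data]
```

Other distance norms, or prototypes that are subspaces rather than points, would change only the `euclidean` function in this sketch.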
Various clustering algorithms have been proposed in the literature, and these
can be classified according to whether the clusters, seen as subsets of the entire
data set, are fuzzy or crisp. Clustering algorithms based on classical set theory
decide for each individual object whether it belongs to a cluster or not, which is
known as hard clustering. Here, the data are partitioned such that any particular
object is a member of exactly one cluster.
Fuzzy clustering algorithms, however, allow the objects to belong to several
clusters simultaneously, but with different degrees of membership, which in many
situations is more natural than hard clustering. For instance, objects
on the boundaries between several clusters are not forced to belong fully to one of
the clusters, but rather are assigned membership degrees between 0 and 1,
indicating their partial membership.
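The difference between the two kinds of membership can be sketched with a small membership table, one row per object and one column per cluster. The membership values below are invented for illustration, and the helper names `is_hard` and `harden` are hypothetical; hardening by taking the largest membership is only one simple way to turn fuzzy degrees into a crisp assignment.

```python
def is_hard(memberships):
    """True if every object has exactly one membership of 1 and the rest 0."""
    return all(
        all(m in (0, 1) for m in row) and sum(row) == 1
        for row in memberships
    )

def harden(memberships):
    """Assign each object fully to the cluster in which its membership is highest."""
    hard = []
    for row in memberships:
        best = row.index(max(row))
        hard.append([1 if g == best else 0 for g in range(len(row))])
    return hard

# Hypothetical membership degrees of three objects in two clusters.
fuzzy = [
    [0.90, 0.10],  # clearly in cluster 0
    [0.55, 0.45],  # near the boundary between the clusters
    [0.20, 0.80],  # clearly in cluster 1
]

crisp = harden(fuzzy)
```

Note that hardening discards the information that the second object lies close to the boundary, which is exactly what the membership degrees preserve.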
In addition, the discrete nature of hard partitioning causes
difficulties for algorithms based on analytic functionals, since these functionals
are not differentiable. Clustering algorithms may use an objective function to
measure the desirability of partitions. Nonlinear optimization algorithms are then used
to search for local optima of the objective function. The concept of a fuzzy partition
is essential for cluster analysis, and consequently also for the identification
techniques based on fuzzy clustering.
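To make the idea of an objective function concrete, the sketch below scores a hard partition by the within-cluster sum of squared distances to the cluster means. The text does not name a particular functional, so this choice, the one-dimensional sample data, and the function names are assumptions made purely for illustration; a lower value is taken to indicate a more desirable partition.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def within_cluster_ss(clusters):
    """Sum over clusters of squared Euclidean distances to the cluster mean."""
    total = 0.0
    for points in clusters:
        v = centroid(points)
        total += sum(sum((x - c) ** 2 for x, c in zip(p, v)) for p in points)
    return total

# Two candidate hard partitions of the same four hypothetical 1-D data points.
compact = [[(0.0,), (1.0,)], [(10.0,), (11.0,)]]
mixed = [[(0.0,), (10.0,)], [(1.0,), (11.0,)]]

score_compact = within_cluster_ss(compact)
score_mixed = within_cluster_ss(mixed)
```

Moving a single object from one cluster to another changes such a score in a discrete jump, which illustrates why gradient-based optimization does not apply directly to hard partitions.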
4.7.1.2 Hard Partition
A hard partition can be considered as a group of subsets formulated in terms of
classical sets. The objective of hard clustering is to partition the given data set
$Z = \{z_1, z_2, \ldots, z_N\}$ into $c$ clusters, also called groups or classes. We initially
assume that the number of clusters, $c$, is known a priori, based on some prior
knowledge about the dynamics of the system that generated the data set $Z$. Using
classical sets, a hard partition of $Z$ can be defined as a family of subsets
$\{A_g \mid 1 \le g \le c\}$ with the following properties (Bezdek, 1981):

\[
\begin{aligned}
&\bigcup_{g=1}^{c} A_g = Z, \\
&A_g \cap A_h = \emptyset, \quad 1 \le g \ne h \le c, \\
&\emptyset \subset A_g \subset Z, \quad 1 \le g \le c.
\end{aligned}
\tag{4.17}
\]
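The properties of a hard partition (the subsets together cover Z, they are pairwise disjoint, and each is a nonempty proper subset of Z) can be checked directly with set operations. The following is a minimal sketch; the function name `is_hard_partition` and the object labels are hypothetical.

```python
def is_hard_partition(subsets, Z):
    """Check whether the family `subsets` is a hard partition of the data set Z."""
    Z = set(Z)
    subsets = [set(A) for A in subsets]

    # The union of all subsets must equal Z.
    covers = set().union(*subsets) == Z

    # Any two distinct subsets must have an empty intersection.
    disjoint = all(
        subsets[g].isdisjoint(subsets[h])
        for g in range(len(subsets))
        for h in range(g + 1, len(subsets))
    )

    # Each subset must be nonempty and a proper subset of Z.
    proper = all(len(A) > 0 and A < Z for A in subsets)

    return covers and disjoint and proper

# Hypothetical data set of four labeled objects.
Z = {"z1", "z2", "z3", "z4"}

valid = is_hard_partition([{"z1", "z2"}, {"z3", "z4"}], Z)
overlapping = is_hard_partition([{"z1", "z2"}, {"z2", "z3", "z4"}], Z)
incomplete = is_hard_partition([{"z1"}, {"z3", "z4"}], Z)
with_empty = is_hard_partition([{"z1", "z2", "z3", "z4"}, set()], Z)
```

Only the first family satisfies all of the properties; the others violate disjointness, coverage, and the nonempty proper subset requirement, respectively.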