with centroids 10 and 65. OK, these might be candidates for “Old” and
“Young”. But if 60 and 70 are selected as initial cluster centroids, vector
10 will be grouped together with 60 and we end up with two clusters
with centroids 35 and 70, which might be a less suitable partition. The
main advantage of the k-means algorithm is its simplicity and speed, a
good feature if an IDS is to use clustering techniques in real time.
Also, its complexity increases linearly with the number of features
used. Other algorithms exist, and these too could be candidates for
automatic clustering, such as: (i) the fuzzy c-means algorithm,
(ii) hierarchical clustering, (iii) mixture of Gaussians.
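The dependence on the initial centroids can be illustrated with a minimal sketch of a single k-means assignment-and-update pass. The data set [10, 60, 70], the first starting pair [10, 60], and the helper name kmeans_step are assumptions made for illustration; they are not given in this section. The sketch performs one pass only, and further passes may move the centroids again.

```python
def kmeans_step(points, centroids):
    """One assignment-and-update pass of k-means on 1-D data (illustrative helper)."""
    clusters = [[] for _ in centroids]
    for p in points:
        # Assign the point to the nearest centroid (absolute distance in 1-D).
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    # New centroid is the mean of its members; an empty cluster keeps its old centroid.
    return [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]


points = [10, 60, 70]  # assumed data, chosen to match the centroids quoted in the text

print(kmeans_step(points, [10, 60]))  # [10.0, 65.0]
print(kmeans_step(points, [60, 70]))  # [35.0, 70.0]
```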
The fuzzy c-means (FCM) algorithm,
also known as fuzzy ISODATA, was introduced by Bezdek [39] as an extension
of Dunn's algorithm to generate fuzzy sets for every observed feature.
The fuzzy c-means clustering algorithm is based on the minimization of
an objective function called the c-means functional. It is one of the
well-known relational clustering algorithms. It partitions the
sample data for each explanatory (input) variable into a number of clusters.
These clusters have “fuzzy” boundaries, in the sense that each data value
belongs to each cluster to some degree or other; membership is not certain,
or “crisp”. Having decided upon the number of such clusters to be used,
some procedure is then needed to locate their centers (or, more generally,
mid-points) and to determine the associated membership functions and the
degree of membership for the data points. Fuzzy clustering methods thus allow
for uncertainty in the cluster assignments. FCM is an iterative algorithm
that finds cluster centers (centroids) that minimize a dissimilarity function.
Rather than partitioning the data into a collection of distinct (crisp) sets,
FCM uses fuzzy partitioning: the membership matrix (U) is randomly
initialized subject to the constraint in Equation (6.4).
Fuzzy c-Means (FCM) Clustering:
\[
\sum_{i=1}^{c} u_{ij} = 1, \qquad j = 1, 2, 3, \ldots, n. \tag{6.4}
\]
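A minimal sketch of such a random initialization is given here, assuming U is stored as c rows (clusters) by n columns (data points) and that each column is normalized so the memberships of one data point sum to 1. The function name init_membership and the seed parameter are hypothetical and serve only this illustration.

```python
import random


def init_membership(c, n, seed=None):
    """Return a c-by-n membership matrix whose columns each sum to 1 (illustrative)."""
    rng = random.Random(seed)
    U = [[0.0] * n for _ in range(c)]
    for j in range(n):
        # Draw c positive random weights for data point j ...
        raw = [rng.random() + 1e-12 for _ in range(c)]  # small offset avoids a zero total
        total = sum(raw)
        # ... and normalize them so the memberships of point j sum to 1.
        for i in range(c):
            U[i][j] = raw[i] / total
    return U


U = init_membership(c=2, n=3, seed=0)
for j in range(3):
    # Each column sum should equal 1 up to floating-point rounding.
    print(sum(U[i][j] for i in range(2)))
```

Normalizing per column rather than per row is the design choice that matters: the constraint runs over the cluster index i for each fixed data point j.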
The dissimilarity function (or, more generally, the objective function)
used in FCM is given in Equation (6.5).
\[
J(U, c_1, c_2, \ldots, c_c) = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, d_{ij}^{2} \tag{6.5}
\]
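The objective can be evaluated directly once U and the centroids are known. The sketch below assumes the standard FCM form, in which each membership is raised to a weighting exponent m and d_ij is the Euclidean distance between centroid i and data point j, entering squared. The function name fcm_objective, the default m = 2, and the sample data are assumptions for illustration.

```python
import math


def fcm_objective(U, centers, data, m=2.0):
    """Sum of u_ij**m * d_ij**2 over all clusters i and data points j (illustrative)."""
    total = 0.0
    for i, center in enumerate(centers):
        for j, x in enumerate(data):
            d_ij = math.dist(center, x)  # Euclidean distance, centroid i to point j
            total += (U[i][j] ** m) * d_ij ** 2
    return total


# Assumed 1-D example: three points, two centroids, and a crisp (0/1) membership matrix.
data = [(10.0,), (60.0,), (70.0,)]
centers = [(10.0,), (65.0,)]
U = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0]]

print(fcm_objective(U, centers, data))  # 50.0
```

With crisp memberships the value reduces to the within-cluster sum of squared distances, here 0 + 25 + 25.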