Information Technology Reference
In-Depth Information
measure R has been selected since a priori group labels were available. This index
is convenient for assessing cluster quality as well as differences between partitioning
results, because its range is the interval [0,1]. In general, validity indices are not
universal. However, they are the most popular tool for assessing clustering results [3].
Simultaneous comparison of several of them can give a fairly objective result. The
evaluation of grouping results is shown in Tables 4 and 5.
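As an illustration of an external index bounded by [0,1], the sketch below computes the Rand index from a priori labels and a clustering result. That R denotes the Rand index is an assumption, since this excerpt does not define R; the function name and the toy labels are hypothetical and are not taken from the study.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of object pairs on which two partitions agree (range [0, 1]).

    Assumption: this is the plain (unadjusted) Rand index; the excerpt
    does not state which definition of R the authors used.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have equal length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two objects")
    agree = 0
    for i, j in pairs:
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        # A pair counts as an agreement when both partitions put the two
        # objects together, or both put them apart.
        if same_true == same_pred:
            agree += 1
    return agree / len(pairs)


# Hypothetical toy labels, not data from the study.
reference = [0, 0, 0, 1, 1, 1]
clustering = [1, 1, 0, 0, 0, 0]
score = rand_index(reference, clustering)
```

Because the index only compares pair memberships, renaming the cluster labels does not change the score, and identical partitions always give 1.0.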
Table 4. Results of clustering of the norm2D2gr and sph2D6gr sets in the form of point-type data and hyperboxes

                        norm2D2gr set               sph2D6gr set
algorithm   index     point-type   granulated    point-type   granulated
                      data         data          data         data
SOSIG       R         0.96         0.98          0.99         0.99
            DB        0.06         0.08          0.03         0.01
            Dunn's    0.16         0.74          0.51         1.38
k-means     R         0.98         1.0           0.76         0.99
            DB        0.07         0.09          0.05         0.01
            Dunn's    0.26         0.50          0.06         1.33
hcl         R         0.98         1.0           0.87         0.99
            DB        0.07         0.09          0.03         0.01
            Dunn's    0.26         0.50          0.51         1.33
hsl         R         0.50         1.0           0.50         1.0
            DB        0.07         0.09          0.03         0.01
            Dunn's    0.26         0.50          0.51         1.33
The index values show that, in most cases, data granulation did not negatively
affect the quality of clustering. Clusterings of norm2D2gr, sph2D6gr and sph10D4gr
in the form of hyperboxes, performed by all of the algorithms, are characterized by
comparable or better values of the internal indices. For the R index, an increase in
quality (up to 50%) can also be observed for the hyperbox results. For the irises set,
the values of the internal indices are better for point-type clustering. However, for
this type of input data the R index is smaller for the hcl and k-means algorithms.
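The size of the R-index change can be checked directly against the R rows of Table 4. The sketch below is only a worked calculation: the values are copied from Table 4, while the dictionary layout and the helper name `r_gains` are hypothetical. Reading "up to 50%" as an absolute gain of 0.50 in R is an interpretation, not something the text states explicitly.

```python
# R-index values copied from Table 4: (point-type, granulated) per data set.
r_index = {
    "SOSIG":   {"norm2D2gr": (0.96, 0.98), "sph2D6gr": (0.99, 0.99)},
    "k-means": {"norm2D2gr": (0.98, 1.0),  "sph2D6gr": (0.76, 0.99)},
    "hcl":     {"norm2D2gr": (0.98, 1.0),  "sph2D6gr": (0.87, 0.99)},
    "hsl":     {"norm2D2gr": (0.50, 1.0),  "sph2D6gr": (0.50, 1.0)},
}


def r_gains(table):
    """Return {(algorithm, data set): granulated R minus point-type R}."""
    return {
        (algo, name): round(gran - point, 2)  # table values have 2 decimals
        for algo, sets in table.items()
        for name, (point, gran) in sets.items()
    }


gains = r_gains(r_index)
largest = max(gains.values())                  # biggest absolute gain in R
no_loss = all(g >= 0 for g in gains.values())  # True if R never decreases
```

For these two data sets the largest gain comes from hsl, whose R rises from 0.50 to 1.0, and no algorithm loses R after granulation. The irises and sph10D4gr results are not part of Table 4, so they are not covered by this check.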
Table 6 contains a detailed description of the groups detected in clustering of the
irises hyperbox data. The final result is composed of 10 clusters. However, due to
considerable differences in their size, the description focuses on the 3 main granules.
The a priori decision attribute is composed of 3 classes: Iris-setosa (I-S),
Iris-versicolor (I-Ve) and Iris-virginica (I-Vi). The set is described by 4 attributes:
sepal-length (SL), sepal-width (SW), petal-length (PL) and petal-width (PW). The
granule gr1 contains 13 smaller granules (hyperboxes), all of which belong to class
Iris-setosa. Another granule (gr3) has comparable size (15 objects) and contains only
objects from the Iris-versicolor class. The largest granule, gr2, consists of 36
hyperboxes. It is not homogeneous with respect to the class attribute, since 31% of
its objects come from the Iris-versicolor class and 69% from Iris-virginica.
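The homogeneity of a granule with respect to the class attribute can be expressed as class shares and a purity value. The sketch below is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the counts used for gr2 (11 and 25) are an assumption chosen so that they sum to 36 and round to the reported 31% and 69%. The excerpt does not give the actual counts.

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of objects per class label within one granule."""
    if not labels:
        raise ValueError("granule must contain at least one object")
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def purity(labels):
    """Share of the dominant class; 1.0 means a homogeneous granule."""
    return max(class_shares(labels).values())


# Assumed counts (not stated in the text): 11 + 25 = 36 hyperboxes.
gr2 = ["I-Ve"] * 11 + ["I-Vi"] * 25
# gr1 as described in the text: 13 hyperboxes, all Iris-setosa.
gr1 = ["I-S"] * 13

shares_gr2 = class_shares(gr2)
purity_gr1 = purity(gr1)
purity_gr2 = purity(gr2)
```

Under these assumed counts, gr1 has purity 1.0, while gr2 is dominated by Iris-virginica with a purity of roughly 0.69.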
Attention has to be paid to the attributes resulting from the doubling of dimensions.
These features correspond to the minimal and maximal values of the original attributes. As
 