5.1 Description of Datasets
Several data sets are used in the experiments; they are listed in Table 1. Three are
synthetically generated (norm2D2gr, sph2D6gr, sph10D4gr) and one contains real data
(irises). The sets differ in the number of objects, the dimensionality and the number of
groups present. The column "number of groups" gives the number of clusters present in
the data according to subjective human perception, based on the separation and
compactness of the groups. The irises data set, however, contains real data supplied with
an a priori class attribute. For this set, the number of groups is therefore the number of
classes in the decision attribute.
Table 1. Data sets used in the experiments
data set     number of     number of   number of     number of
             dimensions    points      hyperboxes    groups
norm2D2gr    2             200         51            2
sph2D6gr     2             300         70            6
irises       4             150         94            3
sph10D4gr    10            200         13            4
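As a rough illustration of how strongly granulation reduces each data set, the ratio of points to hyperboxes can be computed directly from the values in Table 1. The sketch below is only arithmetic on those values; the names TABLE_1 and points_per_hyperbox are hypothetical, and the ratio is a plain quotient, not a measure defined in the source.

```python
# Values copied from Table 1: (dimensions, points, hyperboxes, groups).
TABLE_1 = {
    "norm2D2gr": (2, 200, 51, 2),
    "sph2D6gr": (2, 300, 70, 6),
    "irises": (4, 150, 94, 3),
    "sph10D4gr": (10, 200, 13, 4),
}


def points_per_hyperbox(table):
    """Return the quotient points / hyperboxes for each data set."""
    return {
        name: points / hyperboxes
        for name, (_dims, points, hyperboxes, _groups) in table.items()
    }


ratios = points_per_hyperbox(TABLE_1)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} points per hyperbox")
```

By this quotient, irises is reduced the least (150 points to 94 hyperboxes) and sph10D4gr the most (200 points to 13 hyperboxes).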
5.2 Results of Experiments
The SOSIG algorithm detects the number of clusters automatically. The number of groups
identified in this way in the data sets described above is presented in Table 2. When the
result consists of groups of highly variable sizes, only the number of main groups is
given there. The partitioning of the irises set has two levels (low and high resolution),
which is visible in all the following tables. At the low resolution level the result
contains 2 clusters, whereas at the high resolution level one large cluster is split into
two smaller ones and there are additionally 5 significantly smaller groups. The results
for both levels of granulation are shown in the same table cell, where the first value
corresponds to the low and the second to the high level of resolution. The clustering of
the irises hyperboxes consists of only one level, with 4 main and 6 additional smaller
groups. For the remaining data sets, the number of groups is the same for both types of
processed data.
Table 2. Results of clustering of point-type and granulated data with respect to the number of
identified groups
data set     number of groups
             point-type data    granulated data
norm2D2gr    2                  2
sph2D6gr     6                  6
irises       2, 3               4
sph10D4gr    4                  4
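The comparison described in the text can be checked mechanically. The following sketch encodes the main-group counts from Table 2 and the reference counts from Table 1, then lists the data sets for which the point-type and granulated results agree and those for which a result matches the reference. All names (TABLE_2, EXPECTED, agreeing_sets, matches_reference) are hypothetical helpers introduced for illustration.

```python
# Main-group counts copied from Table 2.
# Point-type irises has two entries: (low resolution, high resolution).
TABLE_2 = {
    "norm2D2gr": {"point": (2,), "granulated": (2,)},
    "sph2D6gr": {"point": (6,), "granulated": (6,)},
    "irises": {"point": (2, 3), "granulated": (4,)},
    "sph10D4gr": {"point": (4,), "granulated": (4,)},
}

# Reference numbers of groups copied from Table 1.
EXPECTED = {"norm2D2gr": 2, "sph2D6gr": 6, "irises": 3, "sph10D4gr": 4}


def agreeing_sets(table):
    """Data sets whose point-type and granulated group counts are identical."""
    return sorted(
        name for name, row in table.items() if row["point"] == row["granulated"]
    )


def matches_reference(table, expected, kind):
    """Data sets where any reported count of the given kind equals the reference."""
    return sorted(name for name, row in table.items() if expected[name] in row[kind])


print("agree:", agreeing_sets(TABLE_2))
print("point-type matches reference:", matches_reference(TABLE_2, EXPECTED, "point"))
print("granulated matches reference:", matches_reference(TABLE_2, EXPECTED, "granulated"))
```

With these values, only irises differs between the two types of data: its point-type result reaches the reference count of 3 at the high resolution level, while the granulated result gives 4 main groups.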
 