Advantages of Information Granulation in Clustering Algorithms - Agents and Artificial Intelligence - page 136

Table 3 reports the run time (in seconds) of a single run of each examined algorithm,
averaged over 50 runs, for clustering the original data as well as the hyperboxes. The
last column of the table contains the quotient of the two values. Processing granulated
data is significantly faster than processing the original point-type objects: up to about
40 times in the case of SOSIG and up to about 14 times in the case of the remaining
algorithms. The acceleration is greatest when the number of objects in the data is large
and considerably exceeds the number of attributes.

Table 3. Average time (in seconds) of clustering hyperboxes and point-type data

data set    algorithm   point-type data (t_pd)   granulated data (t_gd)   t_pd / t_gd
norm2D2gr   SOSIG       0.360                    0.040                    9
norm2D2gr   k-means     0.062                    0.047                    1.32
norm2D2gr   hcl         0.110                    0.032                    3.44
norm2D2gr   hsl         0.125                    0.031                    4.03
sph2D6gr    SOSIG       0.930                    0.080                    11.63
sph2D6gr    k-means     0.187                    0.094                    2.0
sph2D6gr    hcl         0.266                    0.047                    5.66
sph2D6gr    hsl         0.250                    0.032                    7.81
irises      SOSIG       0.870, 0.800             0.790                    1.01
irises      k-means     0.141                    0.125                    1.13
irises      hcl         0.078                    0.046                    1.70
irises      hsl         0.094                    0.047                    2.0
sph10D4gr   SOSIG       0.270                    0.010                    38.57
sph10D4gr   k-means     0.156                    0.047                    3.32
sph10D4gr   hcl         0.141                    0.016                    8.81
sph10D4gr   hsl         0.219                    0.015                    14.6
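
The measurement protocol behind Table 3 (mean run time over 50 runs, then the quotient t_pd / t_gd) can be sketched as follows. This is a minimal illustration, not the authors' benchmarking code: the helper names `average_runtime` and `speedup` are hypothetical, and `cluster_fn` stands for any clustering routine supplied by the reader.

```python
import time
from statistics import mean


def average_runtime(cluster_fn, data, runs=50):
    """Return the mean wall-clock time (seconds) of `runs` calls to cluster_fn(data)."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        cluster_fn(data)
        timings.append(time.perf_counter() - start)
    return mean(timings)


def speedup(t_pd, t_gd):
    """Quotient t_pd / t_gd: how many times faster the granulated data is processed."""
    return t_pd / t_gd


# Timings taken from Table 3: hsl on norm2D2gr and hcl on sph2D6gr.
print(round(speedup(0.125, 0.031), 2))  # 4.03
print(round(speedup(0.266, 0.047), 2))  # 5.66
```

In use, `average_runtime` would be called once with the point-type data and once with the hyperboxes, and the two means passed to `speedup`.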

Comparing the clustering algorithms, the greatest speed-up is observed for the
hierarchical algorithms and SOSIG. As mentioned earlier, hierarchical algorithms attract
researchers' interest because they cluster better than less complex partitioning methods.
However, their time complexity is greater. The same applies to SOSIG. Granulating the
data in advance can therefore be a way of enabling these algorithms to cluster large
databases in reasonable time.
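
To illustrate why reducing the number of objects matters for hierarchical methods, the sketch below counts the pairwise distances a naive agglomerative algorithm would compute for its initial distance matrix. It assumes a naive implementation that evaluates every pair once. The object counts (1000 points, 100 hyperboxes) are hypothetical and do not come from the experiments, and the function names are introduced only for this example.

```python
def pairwise_distance_count(n):
    """Number of unordered pairs among n objects: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def distance_reduction(n_points, n_hyperboxes):
    """Factor by which the initial distance matrix shrinks after granulation."""
    return pairwise_distance_count(n_points) / pairwise_distance_count(n_hyperboxes)


# Hypothetical sizes: 1000 point-type objects granulated into 100 hyperboxes.
print(pairwise_distance_count(1000))            # 499500
print(pairwise_distance_count(100))             # 4950
print(round(distance_reduction(1000, 100), 2))  # 100.91
```

The measured speed-ups in Table 3 depend on the actual implementations and data sets, so this count only indicates the trend, not the observed quotients.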

Obviously, the total clustering time also depends on the time of data preprocessing,
particularly when the data preparation algorithm is complex. However, in the experiments
described in this paper this time is not taken into consideration, for two reasons. First,
the number of objects in the set being prepared decreases by one in every iteration, which
in practice reduces the time complexity of the preprocessing procedure. Second, for
algorithms that take the number of groups as an input parameter, the data must be
clustered at least several times to estimate the number of clusters present in it. In that
case a single preparation of the data matters far less than the repeated clustering.
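
The second argument can be made concrete with a small amortization sketch: a one-time preprocessing cost is paid once, while the per-run saving accumulates with every repeated clustering. The preprocessing time `t_prep` is not reported in the text, so the value of 2.0 seconds used below is purely hypothetical. The clustering times are the SOSIG timings for sph2D6gr from Table 3, and the function name is an assumption made for this example.

```python
import math


def break_even_runs(t_prep, t_pd, t_gd):
    """Smallest number of runs k with t_prep + k * t_gd < k * t_pd."""
    if t_pd <= t_gd:
        raise ValueError("granulated clustering must be faster per run")
    # k must be strictly greater than t_prep / (t_pd - t_gd).
    return math.floor(t_prep / (t_pd - t_gd)) + 1


# t_prep = 2.0 s is a hypothetical preprocessing time, not a measured value.
# t_pd = 0.930 s and t_gd = 0.080 s are the SOSIG timings for sph2D6gr in Table 3.
print(break_even_runs(2.0, 0.930, 0.080))  # 3
```

Under these assumed numbers, granulation pays off from the third clustering run onward: three runs on point-type data take 2.79 s, while preprocessing plus three runs on hyperboxes take 2.24 s.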

To compare the clustering results with respect to the most compact and separable
partitioning, two internal indices have been chosen: DB and Dunn's. In addition, external
