Table 2. Iris dataset matching matrix obtained with k-means using the same metric
as in our proposed algorithm, i.e. Euclidean distance
                   Iris setosa   Iris versicolor   Iris virginica
Iris setosa             50              0                 0
Iris versicolor          0             48                 2
Iris virginica           0             14                36
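As a quick check on Table 2, the overall agreement can be read off the diagonal of the matching matrix. The sketch below assumes that rows are the true species (each row sums to 50, the number of samples per species in the Iris dataset) and that columns are the matched k-means clusters; the variable names are illustrative and do not come from the paper.

```python
# Matching matrix copied from Table 2.
# Assumption: rows = true species, columns = matched k-means clusters.
labels = ["Iris setosa", "Iris versicolor", "Iris virginica"]
matrix = [
    [50, 0, 0],
    [0, 48, 2],
    [0, 14, 36],
]

total = sum(sum(row) for row in matrix)                   # all data items
matched = sum(matrix[i][i] for i in range(len(matrix)))   # diagonal entries
mismatched = total - matched
agreement = matched / total

for i, name in enumerate(labels):
    print(f"{name}: {matrix[i][i]}/{sum(matrix[i])} matched")
print(f"overall: {matched}/{total} matched ({agreement:.1%}), {mismatched} mismatched")
```

Under that reading, 134 of the 150 items fall on the diagonal, and all 16 mismatches lie between Iris versicolor and Iris virginica.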
Fig. 8 shows the chainmap of the Iris dataset. In this case, three local maxima
can be clearly distinguished, each corresponding to an individual cluster.
Fig. 8. Chainmap of the Iris dataset
4 Conclusions and Further Work
A novel algorithm for data clustering based on linear cellular automata has been
proposed. The method treats the individual data items as cells of a
one-dimensional cellular automaton and is inspired by both social segregation
models and Ant Clustering algorithms.
The results obtained on both synthetic and real datasets significantly improve
on those obtained with conventional unsupervised methods such as the k-means
algorithm.
Although the data items are correctly ordered on the tape, the post-processing
task of finding the actual clusters of the dataset still remains. Since we have
used chainmap diagrams formed by the distances between successive data items,
this task amounts to analyzing the diagram and selecting the optimum threshold
that gives the correct clustering solution for each particular dataset.
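The post-processing step described above can be illustrated with a minimal sketch. It assumes one plausible reading of the text: the chainmap is the list of Euclidean distances between successive items on the tape, and a new cluster starts wherever that distance exceeds the chosen threshold. The function names, the cutting rule and the toy data are all hypothetical; the paper does not specify them.

```python
import math

def chainmap(ordered_items):
    """Euclidean distance between each pair of successive items on the tape."""
    return [math.dist(a, b) for a, b in zip(ordered_items, ordered_items[1:])]

def split_by_threshold(ordered_items, threshold):
    """Cut the tape wherever a successive distance exceeds `threshold`.

    Assumed rule for illustration only, not taken from the paper.
    """
    if not ordered_items:
        return []
    clusters = [[ordered_items[0]]]
    for item, gap in zip(ordered_items[1:], chainmap(ordered_items)):
        if gap > threshold:
            clusters.append([item])    # large gap: start a new cluster
        else:
            clusters[-1].append(item)  # small gap: stay in the current cluster
    return clusters

# Made-up 2-D items, already in tape order (not data from the paper).
tape = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (5.0, 1.0), (5.0, 2.0), (9.0, 2.0)]
gaps = chainmap(tape)
clusters = split_by_threshold(tape, threshold=2.0)
print(gaps)
print([len(c) for c in clusters])
```

With this toy tape, a threshold of 2.0 gives three clusters, while a threshold above the largest gap leaves the tape as a single cluster, which is why the threshold has to be chosen per dataset.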