tered incorrectly. Therefore, we define the accuracy as the percentage of points
that are correctly clustered by the algorithm. In a typical run, the accuracy
of IPROCLUS is 91.45% and that of PROCLUS is 70.46%. Both algorithms were
run multiple times. On average, IPROCLUS and PROCLUS achieve accuracies of
91.90% and 70.18%, respectively. In short, IPROCLUS achieves much better
accuracy than PROCLUS on the scaled dataset in the random case.
Table 9.3. Confusion Matrix for IPROCLUS and PROCLUS

IPROCLUS
                          Input
Output           A       B       C       D       E  Outliers
1                3       0      97   22893      39       272
2             1525       1      60     120     428       668
3                0   23854     162       0     463       555
4                1       4   25031      15     348       583
5            14800       0     311      65     422       393
Outliers       439       0      17       0    3902      2529

PROCLUS
                          Input
Output           A       B       C       D       E  Outliers
1            16626       0       2     406     366       260
2                2   22794      12    1346     461       356
3                6       1    2565    6953    1221      1098
4                3       0   19116       0       0        11
5              131     361     310    4803    1450      1181
Outliers         0     703    3673    9585    2104      2094
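
As a reading aid for Table 9.3, the following Python sketch checks that both matrices describe the same input points and reports, for each input cluster, which output row receives most of its points. The helper names (column_totals, dominant_output) are illustrative and not taken from the source. The sketch does not attempt to reproduce the accuracy figures quoted above, because the exact rule for counting a point as correctly clustered is not fully stated in this passage.

```python
# Confusion matrices transcribed from Table 9.3.
# Rows: output clusters 1-5, then the output outlier set.
# Columns: input clusters A-E, then the input outliers.
INPUT_LABELS = ["A", "B", "C", "D", "E", "Outliers"]
OUTPUT_LABELS = ["1", "2", "3", "4", "5", "Outliers"]

IPROCLUS = [
    [3, 0, 97, 22893, 39, 272],
    [1525, 1, 60, 120, 428, 668],
    [0, 23854, 162, 0, 463, 555],
    [1, 4, 25031, 15, 348, 583],
    [14800, 0, 311, 65, 422, 393],
    [439, 0, 17, 0, 3902, 2529],
]

PROCLUS = [
    [16626, 0, 2, 406, 366, 260],
    [2, 22794, 12, 1346, 461, 356],
    [6, 1, 2565, 6953, 1221, 1098],
    [3, 0, 19116, 0, 0, 11],
    [131, 361, 310, 4803, 1450, 1181],
    [0, 703, 3673, 9585, 2104, 2094],
]


def column_totals(matrix):
    """Size of each input cluster: the sum of its column over all output rows."""
    return [sum(column) for column in zip(*matrix)]


def dominant_output(matrix):
    """Map each input label to (output row with most of its points, share captured)."""
    totals = column_totals(matrix)
    result = {}
    for j, label in enumerate(INPUT_LABELS):
        column = [row[j] for row in matrix]
        best = max(range(len(column)), key=lambda i: column[i])
        result[label] = (OUTPUT_LABELS[best], column[best] / totals[j])
    return result


# Both algorithms were run on the same data, so the column totals must agree.
assert column_totals(IPROCLUS) == column_totals(PROCLUS)

for name, matrix in (("IPROCLUS", IPROCLUS), ("PROCLUS", PROCLUS)):
    print(name)
    for label, (output, share) in dominant_output(matrix).items():
        print(f"  input {label:<8} -> output {output:<8} ({share:.1%})")
```

On these tables the column totals of the two matrices coincide and sum to 100,000 points. Input cluster D lands almost entirely in output cluster 1 under IPROCLUS, whereas under PROCLUS its largest share falls in the outlier row.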
Next, we consider the accuracy for the unscaled dataset in the random case.
The data are generated in exactly the same way as in the PROCLUS paper. The
two algorithms produce comparable results: the average accuracies of IPROCLUS
and PROCLUS are 93.02% and 93.94%, respectively.
Table 9.4 gives the average accuracy for the two extreme cases. As in the
random case, IPROCLUS exhibits much better accuracy than PROCLUS for the
scaled datasets and comparable accuracy for the unscaled datasets.
Table 9.4. The Accuracy Result for the Two Extreme Cases

Data                              PROCLUS   IPROCLUS
All-same-dimensions scaled         75.27%        96%
All-same-dimensions unscaled       95.91%     94.57%
No-common-dimensions scaled        51.08%     87.75%
No-common-dimensions unscaled      87.67%     86.87%
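
To make the comparison in Table 9.4 concrete, this short Python sketch computes the difference in accuracy (IPROCLUS minus PROCLUS, in percentage points) for each dataset. The dictionary layout and the names ACCURACY, gain, and GAINS are illustrative choices, not part of the source.

```python
# Accuracies in percent, transcribed from Table 9.4 as (PROCLUS, IPROCLUS).
ACCURACY = {
    "All-same-dimensions scaled": (75.27, 96.0),
    "All-same-dimensions unscaled": (95.91, 94.57),
    "No-common-dimensions scaled": (51.08, 87.75),
    "No-common-dimensions unscaled": (87.67, 86.87),
}


def gain(proclus, iproclus):
    """Accuracy difference in percentage points, rounded to two decimals."""
    return round(iproclus - proclus, 2)


GAINS = {name: gain(*pair) for name, pair in ACCURACY.items()}

for name, points in GAINS.items():
    print(f"{name:<32} {points:+.2f}")
```

The scaled datasets show gains of 20.73 and 36.67 points for IPROCLUS, while the unscaled datasets differ by only -1.34 and -0.80 points.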
In summary, IPROCLUS achieves much better accuracy than PROCLUS
for the scaled data in all three cases. When the unscaled data is considered,