Biomedical Engineering Reference
Table 1 Comparison of classifiers
Classifier                   # Correctly classified samples   Correct classification rate (%)
K-Means with canopy          172                              79.2
Fuzzy C-Means with canopy    186                              84.8
We compare execution times on a single node and on multiple nodes, and we also
compare hard clustering (K-Means) with soft clustering (Fuzzy C-Means). The
experiments use datasets of 1,000, 100,000, 1,000,000, and 10,000,000 records.
Table 1 compares the number of correctly classified samples and the correctness
of the classification for the K-Means and Fuzzy C-Means algorithms with canopy.
The results in Table 1 show that Fuzzy C-Means (FCM) with canopy is more
effective than K-Means with canopy. The experiments were run on Ubuntu 12.10,
Hadoop 0.20.1, and Mahout 6 using Java 7.
In Fig. 2, datasets of different sizes are clustered and the time to complete the
clustering process is analyzed. The graph compares K-Means with canopy and
Fuzzy C-Means with canopy. For smaller datasets, the time taken to cluster the
data is almost equal. As the dataset size increases, however, Fuzzy C-Means
takes less time than K-Means.
Hence, from Fig. 3 and Table 1, it is evident that the proposed method, Fuzzy
C-Means with canopy, is more efficient than K-Means with canopy.
Fig. 3 Graph showing the number of documents versus time required to process
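The distinction between hard clustering (K-Means) and soft clustering (Fuzzy C-Means) discussed above can be illustrated with a minimal Python sketch. This is not the Hadoop/Mahout code used in the experiments. The function names, the two example centers, and the fuzziness parameter m = 2 are assumptions chosen for illustration. The membership rule is the standard Fuzzy C-Means formula.

```python
import math


def hard_assignment(point, centers):
    """K-Means style: return the index of the single nearest center."""
    distances = [math.dist(point, c) for c in centers]
    return distances.index(min(distances))


def soft_memberships(point, centers, m=2.0):
    """Fuzzy C-Means style: return one membership degree per center.

    The degrees sum to 1. m > 1 is the fuzziness parameter (assumed 2.0 here).
    """
    distances = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    zeros = distances.count(0.0)
    if zeros:
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Hypothetical data: two cluster centers and one point closer to the first.
centers = [(0.0, 0.0), (4.0, 0.0)]
point = (1.0, 0.0)

label = hard_assignment(point, centers)        # a single cluster index
memberships = soft_memberships(point, centers)  # one degree per cluster
print(label, memberships)
```

With hard assignment the point belongs only to its nearest center, whereas the soft memberships keep a nonzero degree for the farther center as well.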