Fig. 16.2. Comparison of intra-cluster error sum from the clustering of 5652 yeast genes based on
DNA expression levels in glucose pathway experiments, using different clustering algorithms. Each
gene contains 36 time points, or a 24-dimensional feature vector. The intra-cluster error sum measures
the extent of dissimilarity between objects within the same cluster, and should be minimized.
tered. For convenience, the comparison involving SOM and SOTA will only be
carried out at the optimal cluster number predicted for EP_GOS_Clust. Since
the K-family of clustering approaches is sensitive to the initialization point, we
run each 25 times and use only the best result.
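
The restart procedure described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation used in the experiments: plain K-Means stands in for the K-family, the intra-cluster error sum is assumed to be the sum of squared Euclidean distances from each point to its cluster center, and the function names (intra_cluster_error_sum, kmeans_once, best_of_restarts) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def intra_cluster_error_sum(points, labels, centers):
    """Assumed error measure: sum of squared distances to assigned centers."""
    return sum(sq_dist(p, centers[k]) for p, k in zip(points, labels))


def kmeans_once(points, k, rng, max_iter=100):
    """One K-Means run started from k randomly chosen data points."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the run has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def best_of_restarts(points, k, n_restarts=25, seed=0):
    """Run K-Means n_restarts times; keep the run with the lowest error sum."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        labels, centers = kmeans_once(points, k, rng)
        err = intra_cluster_error_sum(points, labels, centers)
        if best is None or err < best[0]:
            best = (err, labels, centers)
    return best
```

Calling best_of_restarts(points, k) on a list of equal-length numeric tuples returns the error sum, labels, and centers of the best run. Because each restart draws a different set of starting centers, keeping the minimum over the runs reduces, but does not eliminate, the dependence on initialization.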
16.3.2. Intra-cluster Error Sum
Data points in the same cluster should be as similar as possible; hence the intra-
cluster error sum should be minimized. From Fig. 16.2, it can be seen that the
best-performing clustering algorithms are K-Medians and EP_GOS_Clust.
In fact, except in regions of low cluster number, EP_GOS_Clust outperforms
all the other algorithms. One reason for the efficacy of K-Medians at
low cluster numbers is that it uses the data median to compute cluster centers.
This circumvents the distorting effects of outlier data points, which particularly
affect algorithms that use random initialization points, such as K-Means. It is
also notable that the GOS I performs admirably even though the pre-clustering
allows genes with up to 30% difference in feature points to be grouped together.
This reflects the rigor of the subsequent steps in assigning suit_ij parameters, the
GOS clustering, and the process of incrementing the cluster number. The
clustering results also expose the inadequacy of QTClust. It groups genes together
until the cluster reaches a pre-determined tolerance. The algorithm then determines
the number of clusters to use. A different tolerance criterion needs to be speci-
fied in order to obtain a different cluster number. This implies that the process