Keywords: Automatic clustering · Teaching-learning-based optimization · Gene functional enrichments · Cluster validity indices
1 Introduction
Evolutionary algorithms (EAs) are generic meta-heuristic optimization algorithms
that use techniques inspired by nature's evolutionary processes. An EA maintains a
whole population of solutions that are optimized simultaneously, rather than a single
solution. The inherent randomness of the emulated biological processes nevertheless
enables them to provide good approximate solutions. The recently emerged
nature-inspired multi-objective meta-heuristic optimization algorithm teaching-
learning-based optimization (TLBO) [1, 2] and its variation Elitist TLBO [3, 4]
belong to this category. Both algorithms aim to find global solutions for real-
world problems with less computational effort and high reliability. The principal idea
behind TLBO is the simulation of the teaching-learning process of a traditional
classroom in an algorithmic representation with two phases, called teaching and
learning. Elitist TLBO was introduced with a major modification that eliminates
duplicate solutions in the learning phase.
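To make the two phases concrete, the following is a minimal single-objective TLBO sketch in Python/NumPy for a minimization problem with simple box bounds. It is only an illustration of the basic teaching and learning steps under these assumptions, not the multi-objective Elitist TLBO variant applied in this work.

```python
import numpy as np

def tlbo(fitness, bounds, pop_size=20, iters=100, seed=0):
    """Minimal single-objective TLBO sketch (minimization)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    dim = lo.shape[0]
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fit = np.apply_along_axis(fitness, 1, pop)

    for _ in range(iters):
        # Teaching phase: move every learner toward the best solution (the "teacher"),
        # shifted by the population mean scaled with a teaching factor of 1 or 2.
        teacher = pop[np.argmin(fit)]
        mean = pop.mean(axis=0)
        tf = rng.integers(1, 3)
        for i in range(pop_size):
            cand = np.clip(pop[i] + rng.random(dim) * (teacher - tf * mean), lo, hi)
            f = fitness(cand)
            if f < fit[i]:
                pop[i], fit[i] = cand, f

        # Learning phase: each learner interacts with a randomly chosen peer
        # and moves toward the better of the two.
        for i in range(pop_size):
            j = rng.choice([p for p in range(pop_size) if p != i])
            if fit[i] < fit[j]:
                cand = pop[i] + rng.random(dim) * (pop[i] - pop[j])
            else:
                cand = pop[i] + rng.random(dim) * (pop[j] - pop[i])
            cand = np.clip(cand, lo, hi)
            f = fitness(cand)
            if f < fit[i]:
                pop[i], fit[i] = cand, f

    best = np.argmin(fit)
    return pop[best], fit[best]

# Example: minimize the 5-dimensional sphere function.
best_x, best_f = tlbo(lambda x: np.sum(x**2),
                      (np.full(5, -10.0), np.full(5, 10.0)))
```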
Clustering is the subject of active research in several fields such as statistics,
pattern recognition, machine learning, data mining, and bioinformatics. The pur-
pose of clustering is to determine the intrinsic grouping in a set of unlabeled data,
where the objects in each group are indistinguishable under some criterion of
similarity. Clustering is used to partition a dataset into groups so that the data
elements within a cluster are more similar to each other than to data elements in
different clusters. Automatic clustering addresses the challenge of determining the
appropriate number of clusters or partitions automatically.
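For illustration, a conventional partitional algorithm such as k-means performs exactly this kind of grouping, but only after the number of clusters has been fixed in advance. The scikit-learn call below is an assumed, illustrative dependency and is not part of the proposed method.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Toy dataset with a group structure that the algorithm does not know about.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Conventional partitional clustering: the number of clusters must be supplied up front.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
```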
Most of the existing EA-based clustering techniques accept the number of
classes (k) as an input instead of determining it during the iterations. Never-
theless, in many practical situations, the appropriate number of groups in a previ-
ously unhandled dataset may be unknown or impossible to determine even
approximately. To prevent the algorithm from getting stuck at such an impasse,
automatic assignment of the value of k by the algorithm in each run is realized in this
work. These automatically formed clusters are then validated with cluster validity
indices (CVIs), which combine compactness and separability to assess the quality of
the clusters.
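A minimal sketch of that idea, assuming k-means as the underlying partitioner and the silhouette coefficient as the validity criterion, could look like the code below; the work itself couples the CVIs discussed next with TLBO-based clustering rather than this exhaustive search.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

def choose_k(X, k_range=range(2, 11)):
    """Score each candidate k with a validity index and keep the best partition."""
    best_k, best_score, best_labels = None, -1.0, None
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        score = silhouette_score(X, labels)  # balances compactness and separation
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_labels

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
k, labels = choose_k(X)  # k is recovered from the data, not supplied by the user
```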
Cluster validity criteria are of three types: external, internal, and relative. External
indices require a priori data for evaluating the results of a clustering algorithm,
whereas internal indices do not. Internal indices evaluate the results of a clustering
algorithm using only information that involves the vectors of the dataset themselves.
Relative indices evaluate the results by comparing the current cluster structure with
other clustering schemes. The CVIs used in this work are the Rand index (RI) [5],
adjusted Rand index (ARI) [5], Hubert index (HI) [6],