Aligning Multiple Protein Sequences by
Hybrid Clonal Selection Algorithm
with Insert-Remove-Gaps and
BlockShuffling Operators
V. Cutello 1, D. Lee 2, G. Nicosia 1, M. Pavone 2, and I. Prizzi 3
1 Department of Mathematics and Computer Science
University of Catania
Viale A. Doria 6, 95125 Catania, Italy
{vctl, nicosia, mpavone}@dmi.unict.it
2 IBM-KAIST Bio-Computing Research Center
Department of BioSystems, KAIST
373-1, Guseong-dong, Yuseong-gu, Daejeon, Republic of Korea
dhlee@biosoft.kaist.ac.kr, mario@biosoft.kaist.ac.kr
3 Diogenes Research Center, Catania, Italy
prizzi@crsdiogenes.it
Abstract. Multiple sequence alignment (MSA) is one of the most
important tasks in biological sequence analysis. This paper focuses
primarily on protein alignments, but most of the discussion and
methodology also applies to DNA alignments. A novel hybrid clonal
selection algorithm, called an aligner, is presented. It searches for a
set of alignments amongst the population of candidate alignments by
optimizing the classical weighted sum of pairs objective function.
Benchmarks from the BaliBASE library (v.1.0 and v.2.0) are used to
validate the algorithm. Experimental results on the BaliBASE v.1.0
benchmarks show that the proposed algorithm is superior to PRRP,
ClustalX, SAGA, DIALIGN, PIMA, MULTIALIGN, and PILEUP8. On the
BaliBASE v.2.0 benchmarks the algorithm shows interesting results in
terms of SP score with respect to established and leading methods,
i.e. ClustalW, T-Coffee, MUSCLE, PRALINE, ProbCons, and Spem.
Keywords: bioinformatics, multiple sequence alignment, protein
sequences, immune algorithms, clonal selection algorithms,
hypermutation operator.
1 Introduction
Proteomics Multiple Sequence Alignment (MSA) plays a central role in molecular
biology, as it can reveal the constraints imposed by structure and function on
the evolution of whole protein families [1]. MSA has been used for building
phylogenetic trees, identifying conserved motifs, and predicting secondary
and tertiary structures of RNA and proteins [2].