Information Technology Reference
In-Depth Information
Definition 5 [Weighted symbol score]. Let W be such a weight matrix for
every pair of aligned sequences. Then the weighted symbol score for the aligned
sequences
S i
S j is defined as:
ˆ
WSS ( S i , S j )= W ij
M ( s i,k , s j,k )
k =1
Sequence weights can be determined by constructing a guide tree from known
sequences — this is the approach used in this paper. These definitions lead to
the most common faced sum-of-pairs multiple alignment problem: optimizing a
weighted sum-of-pairs function with ane gap penalties.
Definition 6 [Sum-of-pairs multiple alignment problem]
Let S =
{
S 1 ,S 2 ,...,S n }
be a set of n sequences over the alphabet Σ. Compute
the alignment S =
of S over the alphabet Σ that maximises the
weighted symbol score and the a ne gap penalty score for all aligned sequences S i :
S 1 , S 2 ,..., S n }
{
n− 1
AGP S ( S i )
n
n
WSS ( S i , S j )+
max
S
(1)
i =1
j = i +1
i =1
For multiple protein sequence alignment, the weighted sum-of-pairs with ane
gap penalties is a popular objective function included in many MSA packages.
The problem of finding the multiple alignment was investigated in [6] and [7], and
proved to be a NP-hard problem. However, the results presented in [7] was proved
using a not metric scoring matrix (zero distance between two identical residues),
which is different from the actual scoring matrices used in multiple alignments.
Therefore, in [6], the authors improved the previous investigation using a fixed
metric score matrix through a reduction from the Minimum Vertex Cover ,a
classical NP complete problem [8]. Multiple sequence alignment (MSA) decision
problems can be formulated as: given a set S =
of sequences, a
sum-of-pairs objective function, and an integer C . MSA checks for alignments of
S, which have value C or less.
{
S 1 ,...,S n }
3
Hybrid Clonal Selection Algorithm
This work presents a Clonal Selection Algorithm (CSA) [30] with new hypermu-
tation operators for solving the multiple sequence alignment problem. CSAs are
a special class of Immune algorithms (IAs) inspired by the human Clonal Se-
lection Principle [31]. They are effective methods for search and optimization in
real-world applications. The algorithm is population based where each individ-
ual of the population is a candidate solution belonging to the fitness landscape
of a given computational problem. It uses two different methodologies to create
the initial population, as well as new hypermutation operators which insert or
remove gaps in the sequences.
Gap columns which have been matched are moved to the end of the sequence.
Next the remaining elements (amino acids in this work) and existing gaps are
 
Search WWH ::




Custom Search