shifted into the freed space. The designed CSA considers only two immunolog-
ical entities: antigens (Ags) and B cells. The Ag is the problem to solve, i.e. a
given MSA instance, and B cells are the candidate solutions, i.e. a set of alignments that have solved (or approximated) the initial problem [32,33]. To tackle the multiple sequence alignment problem, Ags and B cells are both represented by a matrix of sequences.
Let $\Sigma = \{A, R, N, D, C, E, Q, G, H, I, L, K, M, F, P, S, T, W, Y, V\}$ be the alphabet, where each symbol represents one of the twenty amino acids, and let $S = \{S_1, S_2, \ldots, S_n\}$ be the set of $n \geq 2$ sequences with lengths $\{\ell_1, \ell_2, \ldots, \ell_n\}$, such that $S_i \in \Sigma^{*}$. Therefore, an Ag is represented by a matrix of $n$ rows and $\max\{\ell_1, \ldots, \ell_n\}$ columns, whereas for the B cells an $(n \times \ell)$ matrix was used, with $\ell = (2 \cdot \max\{\ell_1, \ldots, \ell_n\})$. These values were taken from experiments, in which the proposed algorithm was able to develop more compact alignments. In particular, for the B cells a binary matrix was used, where $s_{i,j} = 0$ refers to a gap in the alignment and $s_{i,j} = 1$ to a residue, with $1 \leq i \leq n$ and $1 \leq j \leq \ell$.
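The binary representation described above can be illustrated with a minimal Python sketch. The function names (`encode_bcell`, `decode_row`), the use of per-row offsets to place the residues, and the `-` gap character are assumptions made for illustration only; the row width is passed in explicitly rather than derived.

```python
def encode_bcell(sequences, offsets, width):
    """Build a binary B cell matrix: 1 marks a residue, 0 marks a gap.

    Hypothetical helper: each sequence is placed as one contiguous
    block of 1s starting at its offset, inside a row of `width` cells.
    """
    matrix = []
    for seq, off in zip(sequences, offsets):
        if off < 0 or off + len(seq) > width:
            raise ValueError("sequence does not fit in the row")
        row = [0] * off + [1] * len(seq) + [0] * (width - off - len(seq))
        matrix.append(row)
    return matrix


def decode_row(row, seq, gap="-"):
    """Turn one binary row back into a gapped string for display."""
    residues = iter(seq)
    # Each 1 consumes the next residue of the sequence; each 0 emits a gap.
    return "".join(next(residues) if bit else gap for bit in row)
```

Because a row stores only gap/residue positions, the residues themselves are recovered from the original sequence in order, which is why `decode_row` needs both the row and the sequence.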
A Initialize the Population
Two different strategies were used to create the initial population ($t = 0$) of candidate alignments. The first strategy, random initialization, is based on the use of random “offsets” to shift the initial sequences in the following way: an offset is randomly chosen in the range $[0, (\ell - \ell_i)]$ by a uniform distribution, and then the sequence $S_i$ is shifted by offset positions towards the right side of row $i$ of the current B cell.
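The random offsets strategy can be sketched as follows. This is only an illustration under stated assumptions: the function name `random_offset_bcell` is hypothetical, the B cell is taken to be a binary matrix (1 for a residue, 0 for a gap), and the row width is supplied by the caller.

```python
import random


def random_offset_bcell(sequences, width, rng=random):
    """Create one B cell by shifting every sequence by a random offset.

    `rng` may be the `random` module or a `random.Random` instance,
    which makes the sketch reproducible when a seeded generator is used.
    """
    bcell = []
    for seq in sequences:
        # randint is inclusive on both ends, matching the closed
        # range [0, width - len(seq)] used for the offset.
        offset = rng.randint(0, width - len(seq))
        row = [0] * offset + [1] * len(seq) + [0] * (width - offset - len(seq))
        bcell.append(row)
    return bcell
```

Each row keeps its residues contiguous at initialization; gaps inside a sequence would only appear later through the mutation operators.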
A second way to initialize the population was analyzed: seeding the initial population with CLUSTALW (CLUSTALW-seeding). However, a percentage of the population was still initialized using the offsets strategy described above, to prevent the algorithm from getting trapped in a local optimum. Hence, the second strategy creates a percentage of the initial alignments using CLUSTALW, and the remaining alignments are created by random offsets.
Preliminary experimental results show that the proposed algorithm achieves
better performance using the second strategy. Therefore, all results shown in
this paper were obtained using a combination of the two previously introduced
strategies (80% of B cell population by CLUSTALW seeding and 20% of B cell
population by random initialization using the random offsets).
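The 80%/20% split can be sketched as below. The function `init_population` and its two callables are hypothetical: `make_seeded` stands in for whatever produces a CLUSTALW-seeded alignment and `make_random` for the random offsets strategy. No real CLUSTALW call is made here.

```python
def init_population(pop_size, seed_fraction, make_seeded, make_random):
    """Mix seeded and randomly initialized B cells.

    `seed_fraction` is the share of the population created by the
    seeding callable (0.8 in the setting described above); the rest
    is filled by the random initializer.
    """
    n_seeded = round(pop_size * seed_fraction)
    population = [make_seeded() for _ in range(n_seeded)]
    population += [make_random() for _ in range(pop_size - n_seeded)]
    return population
```

With `pop_size=10` and `seed_fraction=0.8`, this yields 8 seeded and 2 randomly initialized B cells.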
The presented hybrid IA incorporates the classical static cloning operator, which clones each B cell dup times, producing an intermediate population $P^{(clo)}_{N_c}$ of $N_c = d \times dup$ B cells (where $d$ is the population size).
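A minimal sketch of static cloning, assuming each B cell is a nested Python list (the function name `static_cloning` is illustrative):

```python
import copy


def static_cloning(population, dup):
    """Clone every B cell `dup` times.

    Returns an intermediate population of len(population) * dup
    independent copies, so mutating a clone leaves its parent intact.
    """
    return [copy.deepcopy(bcell) for bcell in population for _ in range(dup)]
```

Deep copies are used because the clones are mutated afterwards; sharing rows between a parent and its clones would let one mutation leak into the others.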
The basic mutation processes which are considered in pairwise alignment and
multiple sequence alignments are: substitutions which change sequences of amino
acids, as well as insertions and deletions which add or remove amino acids and/or
gaps. In a first version of the algorithm the classical hypermutation and hypermacromutation operators were used: the first operator flips a bit, using a number of mutations inversely proportional to the fitness function value [34], whereas the hypermacromutation simply swaps two randomly chosen subsequences. However, the first experiments produced non-optimal alignments, leading