the alignment and further optimize its quality. Here the rationale is
to integrate predicted structural information into the alignment,
following the principle that protein structural aspects tend to be
more conserved than the associated sequences during evolution.
PRALINE incorporates secondary structure and/or TM information
by using specific residue exchange matrices during alignment.
PRALINE is available as an online server
(URL: http://www.ibi.vu.nl/programs/PRALINEwww/), which is also
equipped with a SOAP service that gives users easy access to the Web
service from within their own programs or scripts.
2 Method
PRALINE employs a profile-based progressive alignment strategy.
As stated above, after an initial all-against-all pairwise alignment, the
highest scoring sequence pair is joined into the first sequence block.
This sequence block is then aligned with each of the remaining single
sequences, after which the highest scoring pair is again selected. Note
that at this stage the highest scoring alignment can be between the
sequence block and a single sequence, whereas at later stages two
sequence blocks may also be aligned with each other. Alignment proceeds
until all sequences have been aligned in a single MSA. PRALINE
therefore does not rely on a precomputed guide tree, but calculates the
guide tree on the fly by utilizing the information afforded by
pre-aligned blocks at each stage, such that the tree reflecting the
progressive alignment steps becomes available at the end. Since
successive profile scores during
the PRALINE progressive protocol descend uniformly, they can
be used to construct a dendrogram reflecting the alignment order.
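The join order described above can be illustrated with a minimal Python sketch. It is not PRALINE's implementation: the names progressive_join and toy_align are hypothetical, and toy_align is a deliberately crude placeholder (it pads rows with gaps and counts identical columns) standing in for the profile alignment step. The sketch only shows how repeatedly joining the highest scoring pair of blocks yields both the final MSA and a tree that records the alignment order.

```python
from itertools import combinations


def toy_align(block_a, block_b):
    """Placeholder for profile alignment (an assumption, not PRALINE's method).

    Pads all rows with trailing gaps to a common width and scores the pair of
    blocks by the average number of identical, non-gap residues per row pair.
    """
    width = max(len(block_a[0]), len(block_b[0]))
    a = [row.ljust(width, "-") for row in block_a]
    b = [row.ljust(width, "-") for row in block_b]
    matches = sum(
        ra[k] == rb[k] != "-"
        for ra in a for rb in b for k in range(width)
    )
    return matches / (len(a) * len(b)), a + b


def progressive_join(sequences, align=toy_align):
    """Greedily join the highest scoring pair of blocks until one MSA remains.

    Returns (tree, msa, merge_log); the tree is a nested tuple of sequence
    names that reflects the order in which blocks were joined.
    """
    # Every sequence starts as its own single-row block.
    blocks = [(name, [seq]) for name, seq in sequences.items()]
    merge_log = []
    while len(blocks) > 1:
        best = None
        # For clarity all pairs are rescored each round; a real
        # implementation would cache scores of unchanged pairs.
        for i, j in combinations(range(len(blocks)), 2):
            score, merged = align(blocks[i][1], blocks[j][1])
            if best is None or score > best[0]:
                best = (score, i, j, merged)
        score, i, j, merged = best
        tree = (blocks[i][0], blocks[j][0])
        merge_log.append((tree, score))
        blocks = [b for k, b in enumerate(blocks) if k not in (i, j)]
        blocks.append((tree, merged))
    return blocks[0][0], blocks[0][1], merge_log


if __name__ == "__main__":
    toy_input = {"s1": "ACDEF", "s2": "ACDEF", "s3": "ACDQF", "s4": "WWWWW"}
    tree, msa, log = progressive_join(toy_input)
    print(tree)
    for joined, score in log:
        print(joined, score)
```

The merge log plays the role of the dendrogram: each entry records which two blocks were joined and at what score, so the tree is a by-product of the alignment rather than an input to it.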
2.1 The “Core” MSA Protocol in PRALINE
Alignment in PRALINE is carried out using the dynamic programming
technique [7]. The following simple profile-scoring scheme is used to
score a pair of profile positions (columns) x and y:
\[
\mathrm{Score}(x, y) = \sum_{i}^{20} \sum_{j}^{20} \alpha_i \, \beta_j \, \log \frac{P_{ij}}{P_i P_j} \tag{1}
\]
where α_i and β_j are the frequencies with which amino acids i and j
appear in columns x and y, respectively, and M(i, j) is the exchange
value for amino acids i and j according to substitution matrix M
(e.g., BLOSUM62 [12] or PAM250 [13]), written in Eq. 1 in its
log-odds form, log(P_ij / (P_i P_j)).
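As a worked illustration of Eq. 1, the following Python sketch scores two alignment columns as the frequency-weighted sum of exchange values. The function names are hypothetical, and toy_exchange is an assumed stand-in for M(i, j) (+2 for identical residues, -1 otherwise), not BLOSUM62 or PAM250. Ignoring gap characters when computing column frequencies is likewise an assumption of this sketch, not something stated in the text.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(column):
    """Relative frequency of each amino acid in one alignment column.

    Gap characters and other symbols are ignored (an assumption of this sketch).
    """
    residues = [c for c in column if c in AMINO_ACIDS]
    if not residues:
        return {}
    counts = Counter(residues)
    return {aa: n / len(residues) for aa, n in counts.items()}


def toy_exchange(i, j):
    """Hypothetical stand-in for the exchange value M(i, j)."""
    return 2.0 if i == j else -1.0


def profile_score(column_x, column_y, exchange=toy_exchange):
    """Score two columns as the sum over i, j of alpha_i * beta_j * M(i, j).

    Only residues present in a column are visited; absent residues have
    frequency zero and contribute nothing to the double sum over 20 x 20.
    """
    alpha = column_frequencies(column_x)
    beta = column_frequencies(column_y)
    return sum(
        a * b * exchange(i, j)
        for i, a in alpha.items()
        for j, b in beta.items()
    )


if __name__ == "__main__":
    # Column x holds A, A, C; column y holds a single A.
    print(profile_score("AAC", "A"))
```

To use a real substitution matrix, the exchange argument can be replaced by any callable that returns the matrix entry for a pair of residues.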
PRALINE adopts a semi-global alignment strategy, which means that it
aligns sequences over their whole length, but without penalizing the
so-called end gaps, i.e., gaps occurring N- or C-terminally to any of
the sequences. A global alignment strategy is known to be optimal for
sequences of high-to-medium sequence similarity. Since interesting
biological alignments can have sequences that have diverged
considerably beyond the level that can