where
- GOP: user-defined gap opening penalty
- N, M: lengths of the sequences
- S(a, b): average residue mismatch score
- ISF: percentage of identity scaling factor
- GEP: user-defined gap extension penalty
3.2 T-Coffee

Even though the idea proposed in ClustalW is simple, neat, and
performs well, its main drawback is that it depends too heavily on
the initial global alignments. Errors made in the early alignment
phases are propagated and may lead to the exclusion of consistencies
between close pairs and distant ones. T-Coffee, which stands for
Tree-based Consistency Objective Function for alignment Evaluation,
is a progressive alignment approach like ClustalW, but it aims to
overcome the aforementioned drawback [12]. T-Coffee starts by
running ClustalW for global alignment and Lalign (a local alignment
algorithm [13]) for local alignment on all pairs of sequences, and
keeps the top-scoring alignments. These global and local alignments
form two libraries, and a weight is assigned to each pair of aligned
residues. The two libraries are merged into a secondary library by
assigning greater weight to pairs that match in both alignments and
creating new entries for those pairs that do not match. For example,
given the following two sequences:

S1: GARFIELD THE LAST FAT CAT
S2: GARFIELD THE FAST CAT
There are 18 residues in the S2 sequence, two of which are not
matched. Hence, the sequence identity is 100 × (16/18) ≈ 88, which
is the primary weight for this alignment. If this alignment also
exists in the second library, which is built using local alignments,
then the two alignments are merged into one and its new weight is
88 × 2 = 176, assuming the local alignment also has 88% identity.
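
The arithmetic above can be checked with a short sketch. It is an illustration only, not T-Coffee's actual code: the function names (identity_weight, build_library, merge_libraries) are hypothetical, the gapped alignment of S2 (three trailing gaps) is an assumption chosen to reproduce the two mismatches mentioned in the text, and truncating the percentage to an integer is assumed so that 16/18 gives 88.

```python
def identity_weight(aligned_a: str, aligned_b: str, gap: str = "-") -> int:
    """Percent identity over the aligned (non-gap) residue pairs, truncated."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not pairs:
        return 0
    matches = sum(1 for x, y in pairs if x == y)
    return 100 * matches // len(pairs)


def build_library(aligned_a: str, aligned_b: str, gap: str = "-") -> dict:
    """Map each aligned residue pair (i, j) to the alignment's identity weight."""
    weight = identity_weight(aligned_a, aligned_b, gap)
    library = {}
    i = j = 0  # residue positions in the ungapped sequences
    for x, y in zip(aligned_a, aligned_b):
        if x != gap and y != gap:
            library[(i, j)] = weight
        if x != gap:
            i += 1
        if y != gap:
            j += 1
    return library


def merge_libraries(global_lib: dict, local_lib: dict) -> dict:
    """Sum the weights of pairs found in both libraries; keep the rest as new entries."""
    merged = dict(global_lib)
    for pair, weight in local_lib.items():
        merged[pair] = merged.get(pair, 0) + weight
    return merged


# Assumed alignment of the example sequences (spaces removed).
s1 = "GARFIELDTHELASTFATCAT"
s2 = "GARFIELDTHEFASTCAT---"

global_lib = build_library(s1, s2)
local_lib = build_library(s1, s2)  # assume the local alignment is identical
merged = merge_libraries(global_lib, local_lib)

print(identity_weight(s1, s2))  # 88
print(merged[(0, 0)])           # 176
```

Summing the weights is one simple way to give greater weight to pairs that appear in both libraries; it yields the same 176 as the doubling used in the text when both alignments have weight 88.
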
After the primary library is constructed, T-Coffee alters the
pairwise alignment weights by consulting a third sequence in order
to improve the overall MSA at the cost of reducing pairwise
alignment scores. The importance of incorporating a third sequence
is illustrated as follows:
S1:  S1,i
S2:  S2,k    S2,j
Let us assume S1,i aligns comparably well to both S2,j and S2,k.
Therefore, we are not sure which part of S2 to align S1,i to.