Objective Functions - Multiple Sequence Alignment Methods


where

- GOP: user-defined gap opening penalty
- N, M: lengths of the two sequences
- S(a, b): average residue mismatch score
- ISF: percentage of identity scaling factor
- GEP: user-defined gap extension penalty
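The symbols listed above match the gap penalty adjustment published for ClustalW, in which the opening penalty is scaled as (GOP + log(min(N, M))) × S(a, b) × ISF and the extension penalty as GEP × (1 + |log(N/M)|). The sketch below assumes that this is the formula being described. The function name, the argument names, and the use of the natural logarithm are illustrative assumptions, not part of the text.

```python
import math


def adjusted_gap_penalties(gop, gep, n, m, avg_mismatch, isf):
    """Return (adjusted GOP, adjusted GEP) for one pair of sequences.

    Hypothetical helper: assumes the ClustalW-style adjustment and a
    natural logarithm; avg_mismatch plays the role of S(a, b).
    """
    if n <= 0 or m <= 0:
        raise ValueError("sequence lengths must be positive")
    # The opening penalty grows with the log of the shorter length and is
    # scaled by the average mismatch score and the identity scaling factor.
    new_gop = (gop + math.log(min(n, m))) * avg_mismatch * isf
    # The extension penalty grows with the length imbalance of the pair.
    new_gep = gep * (1.0 + abs(math.log(n / m)))
    return new_gop, new_gep


# Example call with made-up parameter values.
print(adjusted_gap_penalties(10.0, 0.2, 120, 95, 0.8, 1.1))
```

With equal-length sequences the log ratio is zero, so the extension penalty is left unchanged and only the opening penalty is adjusted.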

3.2 T-Coffee

Even though the idea proposed in ClustalW is simple, neat, and performs well, its main drawback is that it depends too heavily on the initial global alignments. Errors in the early alignment phases are propagated and may lead to the exclusion of consistencies between close pairs and distant ones. T-Coffee, which stands for Tree-based Consistency Objective Function For alignment Evaluation, is a progressive alignment approach like ClustalW but aims to overcome the aforementioned drawback [12]. T-Coffee starts by executing ClustalW for global alignment and Lalign (a local alignment algorithm [13]) for local alignment on all pairs of sequences, and chooses the top-scoring alignments. These collections of global and local alignments form two libraries, and a weight is assigned to each pair of aligned residues. The two libraries are merged into a single primary library by assigning greater weight to pairs that match in both alignments and creating new entries for those pairs that do not match. For example, given the following two sequences:

S1: GARFIELD THE LAST FAT CAT
S2: GARFIELD THE FAST CAT

There are 18 residues in the S2 sequence, two of which are not matched. Hence, the sequence identity is 100 × (16/18) ≈ 88, which is the primary weight for this alignment. If this alignment also existed in the second library, which is built using local alignments, then the two alignments are merged into one and its new weight is 88 × 2 = 176, assuming the local alignment also has an 88% identity.
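The weight arithmetic above can be checked with a short sketch. It assumes that the two sequences are aligned with the three trailing gaps placed at the end of S2 (which yields 16 matches out of 18 residues), that the identity is taken relative to the shorter sequence and truncated to an integer, and that a library is stored as a dictionary from residue index pairs to weights. All function and variable names are hypothetical.

```python
def identity_weight(row_a, row_b, gap="-"):
    """Percent identity of an aligned pair, relative to the shorter sequence."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b and a != gap)
    len_a = sum(1 for a in row_a if a != gap)
    len_b = sum(1 for b in row_b if b != gap)
    # Integer division truncates 88.9 to 88, as in the example.
    return (100 * matches) // min(len_a, len_b)


def pair_library(row_a, row_b, weight, gap="-"):
    """Give every aligned residue pair (i, j) the weight of its alignment."""
    library, i, j = {}, 0, 0
    for a, b in zip(row_a, row_b):
        if a != gap and b != gap:
            library[(i, j)] = weight
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return library


def merge_libraries(global_lib, local_lib):
    """Sum weights of pairs found in both libraries; keep the rest as new entries."""
    merged = dict(global_lib)
    for pair, weight in local_lib.items():
        merged[pair] = merged.get(pair, 0) + weight
    return merged


# Assumed alignment of the example (spaces dropped, gaps at the end of S2).
S1_ROW = "GARFIELDTHELASTFATCAT"
S2_ROW = "GARFIELDTHEFASTCAT---"

weight = identity_weight(S1_ROW, S2_ROW)
global_lib = pair_library(S1_ROW, S2_ROW, weight)
local_lib = pair_library(S1_ROW, S2_ROW, weight)  # same alignment found locally
merged = merge_libraries(global_lib, local_lib)

print(weight)          # 88
print(merged[(0, 0)])  # 176
```

A pair that appears in only one of the two libraries keeps its single weight in the merged library, which corresponds to creating a new entry for it.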

If in the local alignment of these ... After the primary library is constructed, T-Coffee alters the pairwise alignment weights by consulting a third sequence in order to improve the overall MSA at the cost of reducing pairwise alignment scores. The importance of incorporating a third sequence is illustrated as follows:

S1:  S1,i
S2:  S2,k  S2,j

Let us assume S1,i aligns comparably well to both S2,j and S2,k. Therefore, we are not sure which part of S2 S1,i should be aligned to.
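One way a third sequence can resolve this ambiguity is sketched below. The rule used here, adding the minimum of the two weights along each path S1 → S3 → S2 to the direct weight, follows the library extension described in the T-Coffee publication [12] and is not spelled out in the text above, so it should be read as an assumption. The function name, the dictionary layout, and all numbers are hypothetical.

```python
def extended_weight(lib12, lib13, lib32, i, j):
    """Weight for pairing residue i of S1 with residue j of S2.

    lib12 maps (i, j) pairs of S1/S2, lib13 maps (i, p) pairs of S1/S3,
    and lib32 maps (p, j) pairs of S3/S2 to their primary weights.
    """
    weight = lib12.get((i, j), 0)
    for (i1, p), w13 in lib13.items():
        if i1 != i:
            continue
        # A path through residue p of S3 supports the pair (i, j) only as
        # strongly as its weaker link; a missing link contributes nothing.
        weight += min(w13, lib32.get((p, j), 0))
    return weight


# Residue 5 of S1 aligns equally well to residues 3 and 9 of S2.
lib12 = {(5, 3): 70, (5, 9): 70}
# S3 links residue 5 of S1 to residue 3 of S2 through its own residue 4.
lib13 = {(5, 4): 90}
lib32 = {(4, 3): 80}

print(extended_weight(lib12, lib13, lib32, 5, 3))  # 150
print(extended_weight(lib12, lib13, lib32, 5, 9))  # 70
```

In this made-up example the direct weights are tied, and the support from the third sequence favors one of the two candidate positions in S2.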
